ABSTRACT

The  Set  Covering  Problem  (SCP)  and  the  Set  Partitioning  Problem  (SPP) 
represent  an  important  class  of  all-binary  (0-1)  Integer  Linear  Programs 
(ILP).  A  review  of  the  literature  reveals  extensive  application  of  the 
SPP/SCP  model  to  a  wide  set  of  practical  problems.  The  basic  model  is 
explained,  and  then  many  of  the  actual  applications  of  this  powerful 
model  discovered  in  the  literature  review  are  discussed.  The  problems 
derived  from  these  applications  are  difficult  to  solve  with  any  method, 
and  are  particularly  difficult  to  solve  with  optimal  or  exact  algorithms. 
Various  solution  techniques  are  investigated  within  the  framework  of  the 
classical  simplex  method  with  branch  and  bound  enumeration.  Several 
reformulations of the SPP/SCP as Integer Generalized Networks are examined.
Extensive  computational  results  are  reported  for  several  "real  world" 
large-scale  problems,  and  a  convenient,  compact  format  for  data  input 
is proposed as a standard for this problem class.

I.   INTRODUCTION

The  Set  Covering  Problem  (SCP)  and  the  Set  Partitioning  Problem  (SPP) 
represent  an  important  class  of  all-binary  (0-1)  Integer  Linear  Programs 
(ILP's).  These  problems  have  binary  variables,  binary  constraint 
coefficients  and  unit  or  integer  resources. 

The  basic  SPP/SCP  model  has  been  known  for  over  25  years.   It  is 
enticing  in  formulation  and  deceptively  simple.  A  review  of  the  open 
literature  reveals  extensive  application  of  the  SPP/SCP  model  to  a  wide 
range  of  problems,  including  airline  crew  scheduling,  vehicle  routing, 
and  facilities  location.  Even  though  the  model  has  been  intensively 
studied  for  both  its  intriguing  binary  structure  and  its  potential  for 
practical  application,  exact  solution  technologies  for  large-scale 
problems  were  not  evident  until  the  work  of  such  researchers  as  Marsten 
[Ref. 1] began to appear in the early 1970's. Other early contributors
are  listed  by  Christofides  [Ref.  2]. 

After  first  defining  the  basic  model  and  discussing  many  of  its 
applications,  several  reformulations  of  the  SPP/SCP  will  be  examined. 
Glover  and  Mulvey  [Ref.  3]  have  presented  two  reformulations  of  the 
binary  ILP  as  an  Integer  Generalized  Network  (GNIP).  There  is  very 
little computational evidence in the literature concerning these
reformulations; therefore, the computational behavior of this approach will
be tested and results reported for several problems. Another reformulation
of the SPP/SCP as an Integer Generalized Processing Network (a network
with special side constraints) will be examined and its potential evaluated.

As indicated by the many reformulations, manipulative techniques, and
heuristic methods appearing in the literature, these problems are
difficult to solve reliably with any method, and are particularly difficult to
solve  with  optimal  or  exact  algorithms.  Various  solution  techniques 
based  on  the  classical  simplex  method  with  branch  and  bound  enumeration 
are  investigated  in  this  study.  Some  of  the  techniques  examined  are 
basis  factorization,  elastic  programming,  enumeration  schemes,  network  and 
linear  programming  relaxation,  logical  reduction,  and  heuristic  methods 
for  obtaining  starting  solutions.  Extensive  computational  results  are 
reported  for  several  "real  world"  large-scale  problems,  and  a  convenient, 
compact  format  for  data  input  is  proposed  as  a  standard  for  this  problem 
class.

II.   THE BASIC MODEL AND MODEL GENERATION

A.   THE  BASIC  MODEL 

The SCP formulated as an ILP is of the form:

(1)  $\min \sum_{j=1}^{n} C_j X_j$

(2)  s.t.  $\sum_{j=1}^{n} a_{ij} X_j \ge b_i, \quad i = 1, \ldots, m$

(3)  $X_j \in \{0,1\}, \quad j = 1, \ldots, n$

(4)  $C_j \ge 0, \quad j = 1, \ldots, n$

(5)  $b_i \ge 0$ and integer, $i = 1, \ldots, m$

where $a_{ij} = 1$ if column j covers row i, and 0 otherwise.

A minimal cost set of columns must be selected from the $X_j$ such that
the magnitudes of the right-hand sides (RHS), $b_i$, are "covered" or
satisfied. If (2) and (5) are replaced by

(7)  $\sum_{j=1}^{n} a_{ij} X_j = 1, \quad i = 1, \ldots, m,$

we  have  the  SPP  (sometimes  referred   to  in  the  literature  as  the  equality 
constrained  SCP).  This  restriction  of  the  SCP  exhibits  sufficient 
modelling  and  computational  interest  to  be  studied  in  its  own  right.  For 
the  SPP,  the  rows  (i)  represent  a  set  which  must  be  partitioned  by  a 
combination  of  mutually  exclusive  columns  at  minimum  cost. 
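
The distinction between a cover and a partition can be made concrete with a minimal sketch. The matrix `a`, the costs `c`, and the helper names below are illustrative assumptions and are not data from this study.

```python
def row_coverage(a, x):
    """For each row i, return the sum over j of a[i][j] * x[j]."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def is_cover(a, x, b):
    """SCP feasibility: row i is covered at least b[i] times."""
    return all(s >= bi for s, bi in zip(row_coverage(a, x), b))


def is_partition(a, x):
    """SPP feasibility: every row is covered exactly once."""
    return all(s == 1 for s in row_coverage(a, x))


def cost(c, x):
    """Objective value: sum over j of c[j] * x[j]."""
    return sum(cj * xj for cj, xj in zip(c, x))


# Hypothetical instance with 3 rows and 4 columns.
a = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
c = [2, 3, 3, 1]
b = [1, 1, 1]

x_cover = [0, 1, 1, 0]   # row 2 of 3 is covered twice: a cover, not a partition
x_part = [1, 0, 1, 0]    # every row covered exactly once: a partition
```

Every partition is also a cover with unit right-hand sides, which is why the SPP is described as a restriction of the SCP.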
B.  SIDE CONDITIONS

Many  practical  applications  of  the  SPP/SCP  formulation  add  logical 
constraints  to  the  basic  model  discussed  above.  For  example,  suppose 
there are p sets of columns $S_k$, k = 1, ..., p in the model and only one
column from each of the sets $S_k$ is eligible to be included in the final
solution. This restriction will produce constraints

(8)  $\sum_{j \in S_k} X_j = 1$ for all k.

In another case, suppose that the solution must include exactly J columns.
This results in the cardinality constraint

(9)  $\sum_{j} X_j = J$

being  appended  to  the  basic  model.  Introductory  modelling  texts  such  as 
Wagner  [Ref.  4]  and  Gaver  and  Thompson  [Ref.  5]  discuss  many  such  logical 
conditions  formulated  with  binary  variables.  Any  or  all  of  these  logical 
conditions  can  be  included  to  extend  the  basic  model  for  the  purpose 
required. 

C.  COLUMN  GENERATION 

The  art  of  formulating  the  practical  SPP/SCP  lies  in  the  schemes  used 
for column generation. It is possible, of course, to generate all $2^m - 1$
columns  capable  of  covering  or  partitioning  the  rows,  but  for  any  relatively 
large  number  of  rows,  the  problem  becomes  intractable.  This  "all  possible 
combinations"  formulation  is  known  as  the  Complete  SPP/SCP,  and  even 
though  techniques  are  emerging  for  attacking  such  problems  [Ref.  6], 
efforts  must  be  made  to  keep  the  number  of  permissible  columns  within 
the capabilities of the optimizer being used. Editing reductions can be
realized by incorporating such conditions as managerial specifications of
operating  policy;  dimensional  restrictions  on  time,  distance,  and  space; 
legal  restrictions;  labor  union  restrictions;  cash  flow  restrictions; 
environmental  restrictions;  and  as  many  other  "real  world"  constraints  as 
can  be  included  in  the  column  generation  process. 

Incorporating  such  conditions  into  the  column  generator  can  handle 
most,  if  not  all,  side  conditions  and  feasibility  issues  without  including 
them  as  extensions  of  the  basic  SPP/SCP  model.  Some  examples  are  described 
by  Marsten  and  Muller  [Ref.  7],  Shanker,  Turner,  and  Zoltners  [Ref.  8], 
and  Cullen,  Jarvis,  and  Ratliff  [Ref.  9]. 
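
The effect of editing during column generation can be sketched as follows. The admissibility rule `short_adjacent` is a hypothetical stand-in for the operating restrictions listed above, and none of the names are taken from the cited generators.

```python
from itertools import combinations


def complete_columns(m):
    """Yield every non-empty subset of rows 0..m-1 (2**m - 1 columns)."""
    for size in range(1, m + 1):
        for rows in combinations(range(m), size):
            yield rows


def edited_columns(m, is_admissible):
    """Keep only the columns that pass the editing rule."""
    return [rows for rows in complete_columns(m) if is_admissible(rows)]


def short_adjacent(rows):
    """Hypothetical rule: at most two rows, and they must be consecutive."""
    return len(rows) <= 2 and rows[-1] - rows[0] <= 1


m = 5
all_cols = list(complete_columns(m))      # the Complete SPP/SCP column set
kept = edited_columns(m, short_adjacent)  # the edited column set
```

For m = 5 the complete formulation has 31 columns while the edited one has 9; the complete count doubles with each added row.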

D.  THE OBJECTIVE FUNCTION

The cost coefficients $C_j$ for the basic model can be of two types:
physical  and  ordinal.  A  physical  cost  is  a  coefficient  in  units  of 
dollars,  miles,  time,  etc.,  and  represents  the  cost  of  covering  certain 
rows  with  column  j.  The  associated  physical  objective  function  expresses 
the  cost  of  covering  or  partitioning  the  set  represented  by  the  rows. 

It  is  quite  often  the  case,  though,  that  the  cost  coefficients  serve 
only  as  a  means  of  distinguishing  between  alternate  columns.  In  many 
political  and  social  models,  for  example,  a  column  will  be  assigned  a 
subjective number depicting some measure of acceptability (or
unacceptability), thus effecting an ordinal ranking structure in the objective
function. The objective then becomes a matter of selecting those columns
which minimize the ordinality. A much-used special case of the ordinal
cost structure is the unit-cost objective function in which $C_j = 1$ for
all j. The optimal solution for the unit-cost objective function is the
minimum number of columns capable of covering or partitioning the row set
without regard to physical cost or ordinal ranking.

III.  APPLICATIONS AND PROBLEM DESCRIPTIONS

A.  APPLICATIONS 

Set  Covering  and  Set  Partitioning  Problems  have  been  studied  extensively 
because  of  their  many  practical  applications.  The  surveys  by  Garfinkel 
and  Nemhauser  [Ref.  10]  and  Balas  and  Padberg  [Ref.  11]  list  many  useful 
applications  which  have  appeared  in  the  literature.  Some  of  these  are 
listed  below,  along  with  a  few  which  have  subsequently  appeared. 

APPLICATION                          REFERENCE (*Unsighted)

1.  Truck Deliveries                 [Ref. 12]*, [Ref. 13]*, [Ref. 14]*,
                                     [Ref. 15], [Ref. 16]*.
2.  Tanker Routing                   [Ref. 17].
3.  Aircrew Scheduling               [Ref. 18]*, [Ref. 19]*, [Ref. 20]*,
                                     [Ref. 21]*, [Ref. 22], [Ref. 23],
                                     [Ref. 7].
4.  Facilities Location              [Ref. 24], [Ref. 25]*, [Ref. 26],
                                     [Ref. 27], [Ref. 28].
5.  Air Fleet Scheduling             [Ref. 29]*, [Ref. 7].
6.  List Selection                   [Ref. 30].
7.  Political Districting            [Ref. 31]*, [Ref. 32]*, [Ref. 33].
8.  Nuclear and Conventional         (See Appendix A).
    Targeting
9.  Information Retrieval            [Ref. 34]*, [Ref. 35]*.
10. Symbolic Logic                   [Ref. 36]*.
11. Switching Theory                 [Ref. 37], [Ref. 38]*, [Ref. 39]*,
                                     [Ref. 40]*.
12. Stock Cutting or Trimming        [Ref. 41]*.
13. Line Balancing                   [Ref. 42]*.
14. Capacity Balancing               [Ref. 43]*.
15. PERT-CPM                         [Ref. 36]*.
16. Frequency Allocation             [Ref. 44].
17. Tracking Problems                [Ref. 45].
18. Vehicle Routing                  [Ref. 9].
19. Sales Territory Design           [Ref. 8].
20. Coloring Problems                [Ref. 46]*, [Ref. 47], [Ref. 48]*.
21. Disconnecting Paths in a Graph   [Ref. 49], [Ref. 50].
22. Cyclic Scheduling Problems       [Ref. 51], [Ref. 52].

B.  THE  TRUCK  DELIVERY  PROBLEM 

The  first  application  we  will  examine  is  one  that  appears  quite  often 
in  textbooks  and  is  a  simple  illustration  of  the  basic  model.  This 
problem  will  also  provide  an  example  which  will  be  carried  forward 
through  discussion  in  later  sections. 

Consider  the  problem  of  making  deliveries  to  m  locations  by  truck 
(rail,  aircraft,  ship,  messenger,  etc.).  There  are  n  feasible  routes  to 
choose from and $a_{ij} = 1$ if location i is on route j. A cost $C_j$ (say,
time,  dollars,  miles)  is  assigned  to  route  j.  An  optimal  partition  gives 
a  minimal  cost  routing  that  makes  each  delivery  exactly  once.  An  optimal 
cover  gives  a  minimal  cost  routing  that  makes  sufficient  deliveries  to 
satisfy  the  demand  at  each  location.  The  optimal  solution  to  the  unit- 
cost  problem  yields  the  minimum  number  of  trucks  necessary  to  make  the 
required deliveries.

Table 1 is the explicit tableau for an illustrative example of the
SPP. The flight scheduler for a small West Coast air freight company has
been assigned the task of delivering exactly one of seven identical
packages to each of seven western cities by tomorrow morning. All of the
delivery points can be reached in the required time with the current
schedule except San Diego. There are only three feasible ways to make
the San Diego delivery: extend route 5, extend route 6, or add a new
route 7. The cost of the alternatives is calculated and appears in the
tableau. By inspection, there are only two feasible solutions: (1)
Routes 1 and 5 at a cost of 6, and (2) Routes 1, 4, and 7 at a cost of
4. The minimum cost solution, therefore, is to add flight 7 to the
current schedule. The unit-cost solution or minimum partition is to use
solution (1).

TABLE 1.  AIR FREIGHT EXAMPLE

                 R1   R2   R3   R4   R5   R6   R7     RHS

Los Angeles       1    0    0    0    0    0    0     = 1
San Francisco     1    1    1    0    0    0    0     = 1
San Jose          1    1    1    0    0    0    0     = 1
Denver            0    0    1    1    1    0    0     = 1
Portland          0    0    0    1    1    0    0     = 1
Seattle           0    0    0    1    1    1    0     = 1
San Diego         0    0    0    0    1    1    1     = 1

Costs             0    0    0    0    6    7    4     obj.
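
The inspection argument for the air freight example can be checked by complete enumeration, which is practical only because the example has seven columns. The sketch below transcribes the tableau of Table 1; the function and variable names are illustrative assumptions.

```python
from itertools import product

# Rows: Los Angeles, San Francisco, San Jose, Denver, Portland,
# Seattle, San Diego. Columns: routes R1..R7, as in Table 1.
a = [
    [1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1],
]
costs = [0, 0, 0, 0, 6, 7, 4]


def feasible_partitions(a, costs):
    """Enumerate all 0-1 vectors and keep those covering each row exactly once."""
    m, n = len(a), len(a[0])
    found = []
    for x in product((0, 1), repeat=n):
        if all(sum(a[i][j] * x[j] for j in range(n)) == 1 for i in range(m)):
            routes = tuple(j + 1 for j in range(n) if x[j])
            found.append((routes, sum(costs[j] * x[j] for j in range(n))))
    return found


solutions = feasible_partitions(a, costs)
cheapest = min(solutions, key=lambda s: s[1])      # minimum cost partition
fewest = min(solutions, key=lambda s: len(s[0]))   # unit-cost (minimum) partition
```

The enumeration visits 2^7 = 128 vectors; the same approach is hopeless for the large-scale problems described next.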

Two large-scale, real-life problems of this type have been examined
in  this  study:  TRUCK  and  TANKER.  TRUCK  is  a  nationwide,  intercity  truck 
routing problem, a large SCP, and fits the basic description above.
TANKER is a worldwide oil tanker fleet scheduling problem which extends
the basic SPP model to help choose between company-owned and charter
tankers to meet refinery delivery requirements from available loading
volumes and origins. Each cargo, company-owned ship, and potential
charter vessel is represented by a row. Cargoes must be carried, and
ships must either be used or scheduled in demurrage. Each column
represents a feasible route for a particular ship; during the planning
horizon, it may carry zero or more cargoes. The cost of each route may
be calculated ordinally (based on fleet size) or economically (based on
operating costs).

The  problem  dimensions  for  TRUCK  and  TANKER  are  listed  in  Table  2. 
NZEL  is  the  total  number  of  non-zero  elements,  and  NCE  is  the  average 
number  of  non-zero  elements  per  column. 

TABLE 2.  TRUCK DELIVERY PROBLEM DIMENSIONS

            ROWS   COLUMNS    NZEL    NCE   MODEL

TRUCK        239      4752   30075    8.0   SCP
TANKER       166      7563   31289    4.1   SPP

C.  THE  AIRCREW  SCHEDULING  PROBLEM 

An  airline  has  a  set  of  m  flight  legs,  each  of  which  requires  a  crew. 
Given  the  airline's  timetable,  a  set  of  n  possible  crew  rotations  can  be 
generated.  Each  crew  rotation  is  a  sequence  of  scheduled  flight  segments 
constituting  a  roundtrip  (a  sequence  departing  from  and  returning  to  one 
of the airline's crew bases). A cost is calculated for each rotation,
and once a complete set of flyable rotations has been generated, the
problem  is  to  select  an  optimal,  feasible  subset. 

Depending  on  whether  or  not  crew  members  are  allowed  to  be  passengers 
on  certain  flights,  optimal  covers  or  optimal  partitions  yield  optimal 
schedules.  "Deadheading"  is  the  practice  of  allowing  a  crew  to  travel  as 
passengers  on  certain  flights.  Planned  deadheading  can  be  accommodated 
with the partitioning model. If rotation j includes a planned deadhead
on flight segment i, then $a_{ij}$ is set to zero rather than one. If
unplanned  deadheading  is  allowed,  however,  then  a  covering  problem  must 
be  solved. 

A  typical  side  condition  common  to  these  models  is  the  Crew  Base 
Constraint  of  the  form 

$\sum_{j \in D_s} H_j X_j \le M_s$  for s = 1, ..., S

where $H_j$ is the number of flying hours, per month, associated with
rotation j; $M_s$ is the maximum number of flying hours available per
month at crew base s; and $D_s$ is the set of rotations flown out of crew
base s.

All  of  the  problems  of  this  type  were  provided  by  Professor  Roy  E.  Marsten, 
University  of  Arizona,  and  are  described  in  [Ref.  23].  TIGER1  and  TIGER2 
are  examples  of  crew  scheduling  problems  generated  for  Flying  Tiger 
Airlines.  AMERICAN  is  a  large  crew  scheduling  problem  generated  by 
American  Airlines.   (All  of  the  airline  problems  in  this  study  were  solved 
without crew base constraints.) The last problem in this class is BUS, a
driver-scheduling  problem  generated  for  Helsinki  City  Transport  as 
described in [Ref. 23].

TABLE 3.  AIRLINE CREW SCHEDULING PROBLEM DIMENSIONS

            ROWS   COLUMNS    NZEL    NCE   MODEL

TIGER1       160       636    4134    6.5   SPP
TIGER2       107      2188    8266    3.8   SPP
AMERICAN      95      9318   57293    6.1   SPP
BUS           56       530    3365    6.3   SPP

D.  THE  MAXIMAL  SET  COVERING  PROBLEM 

Either the facilities location problem or the list selection problem
can be formulated as a Maximal Set Covering Problem. This problem
differs from the basic SCP because we no longer require that all rows be
covered; rather, the objective is to cover as many rows as possible
subject to various constraints. To accomplish this, m continuous variables,
$Y_i$, are added to the basic model to produce the following Mixed Integer
Program (MIP):

      $\min \sum_{i=1}^{m} Y_i$

      s.t.  $\sum_{j=1}^{n} a_{ij} X_j + Y_i \ge 1, \quad i = 1, \ldots, m$

(9)   $\sum_{j=1}^{n} X_j = J$

(10)  $\sum_{j=1}^{n} C_j X_j \le B$

      $0 \le Y_i \le 1, \quad i = 1, \ldots, m$

      $X_j \in \{0,1\}, \quad j = 1, \ldots, n.$

The constraint (9) limits the number of rows which may be covered by
specifying that only J columns can be used. The constraint (10) is a
budget constraint which specifies that as many rows as possible be
covered for B dollars. The sense of the objective function here is to
minimize the number of rows left uncovered.

Another formulation in the same spirit replaces the objective function
by the familiar $\min \sum_{j} C_j X_j$, and adds the constraint

(11)  $\sum_{i=1}^{m} Y_i \le M.$

This formulation seeks the minimum cost set of columns which leaves at
most M rows uncovered.
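
The cardinality-constrained version of the model, with constraint (9) only, can be illustrated by a minimal brute-force sketch. The instance and the function name `max_cover` are hypothetical. The count of uncovered rows plays the role of the sum of the $Y_i$, since at an optimum $Y_i$ is 1 for an uncovered row and 0 otherwise.

```python
from itertools import combinations


def max_cover(columns, m, J):
    """Choose exactly J columns so that the fewest rows are left uncovered.

    columns[j] is the set of rows covered by column j.
    Returns (uncovered_count, chosen_column_indices); ties keep the first found.
    """
    best = None
    for chosen in combinations(range(len(columns)), J):
        covered = set().union(*(columns[j] for j in chosen))
        uncovered = m - len(covered)
        if best is None or uncovered < best[0]:
            best = (uncovered, chosen)
    return best


# Hypothetical instance: 5 rows, 5 columns, each column covering 2 rows.
columns = [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {0, 4}]
m = 5

two = max_cover(columns, m, 2)     # two columns cover at most 4 of the 5 rows
three = max_cover(columns, m, 3)   # three columns can cover every row
```

Replacing the test on the number of chosen columns by a test on their total cost would give the budget-constrained version with constraint (10).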

Dwyer  and  Evans  [Ref.  30]  have  applied  a  similar  formulation  to  the 
"list  selection  problem."  The  list  selection  problem  selects  a  set  of 
subscriber  lists  which  maximizes  the  proportion  of  customers  reachable 
with  direct  mail  pieces.  The  rows  correspond  to  magazine  subscribers, 
and the columns to individual magazines. Let $a_{ij} = 1$ if individual i
subscribes  to  magazine  j,  and  zero  otherwise. 

Moore  and  Revelle  [Ref.  28]  have  applied  this  formulation  to  a 
hierarchical  facilities  location  problem.  The  rows  represent  demand 
points  and  the  columns  represent  various  location  strategies.  The  objective 
is to pick those strategies which cover as much of the demand as possible.

STEINER1  and  STEINER2  are  two  computationally  difficult  set  covering 
problems  published  by  Fulkerson,  Nemhauser,  and  Trotter  [Ref.  53].  Each 
row has exactly three non-zero elements, $b_i = 1$ for all i, and the
objective function is of the unit-cost type. These basic SCP's were
extended to MCOVER1 and MCOVER2 respectively, in order to evaluate the
difficulty of the Maximal Covering Problem.

TABLE 4.  MAXIMAL COVERING PROBLEM DIMENSIONS

            ROWS   COLUMNS    NZEL    NCE   MODEL

STEINER1     117        27     351     13   SCP
MCOVER1      118       144     495     13   SCP(ext)
STEINER2     330        45     990     22   SCP
MCOVER2      331       375    1365     22   SCP(ext)

Only the budget constraint (10) was added to produce the extended
problems. The value of B was set so that MCOVER1 seeks the same
optimal solution as STEINER1, and MCOVER2 seeks the same optimal solution
as STEINER2.

IV.  COMPUTATIONAL DIFFICULTIES

A.  THE  BRANCH  AND  BOUND  ENUMERATION  METHOD 

It  is  evident  that  the  SCP/SPP  is  a  powerful  model  with  many  useful 
applications.  Unfortunately,  it  is  also  true  that  the  large-scale 
SCP/SPP is difficult to solve to optimality. In fact, Karp [Ref. 54] has
shown  the  set  covering  problem  to  be  an  NP-hard  combinatorial  problem. 
The  solution  techniques  investigated  here  involve  simplex-based  enumeration, 
often  called  branch  and  bound. 

Branch and bound is an enumerative method that has been used
successfully to optimize a variety of combinatorial problems. The basic
principle is to methodically search the set of possible integer solutions
in  such  a  way  that  not  all  possibilities  need  be  explicitly  considered. 
The  theoretical  framework  for  this  study  is  provided  in  the  following, 
which  has  been  adapted  from  Geoffrion  and  Marsten  [Ref.  10].  The  procedure 
of  branch  and  bound  is  described  in  terms  of  three  concepts:  separation, 
relaxation,  and  fathoming. 

1.  Separation 

For  any  optimization  problem  (P),  let  F(P)  denote  its  set  of 
feasible  solutions.  Problem  (P)  is  said  to  be  separated  into  subproblems 
if  the  following  conditions  hold: 

1.  Every  feasible  solution  of  (P)  is  a  feasible  solution  of  exactly 
one  of  the  subproblems. 

2.  A  feasible  solution  of  any  of  the  subproblems  is  a  solution 
of (P).

The procedure is to first make a reasonable effort to solve (P).
If  this  effort  is  unsuccessful,  separate  (P)  into  two  subproblems, 
thereby  initiating  what  will  be  called  a  candidate  list  of  subproblems. 
A  reasonable  representation  of  the  candidate  list  may  be  an  "enumeration 
tree"  which  reveals  the  partial  ordering  of  consideration  among  candidates 
in  the  list.  Extract  one  of  the  subproblems  from  the  list  and  call  it 
the  current  candidate  problem  (CP).   If  (CP)  can  be  solved,  extract  a  new 
candidate  problem  from  the  list;  otherwise,  separate  (CP)  and  add  its 
"descendants"  to  the  candidate  list.  Continue  in  this  fashion  until  the 
candidate  list  is  exhausted  (i.e.,  every  branch  of  the  enumeration  tree 
has  been  examined).   If  we  refer  to  the  best  solution  found  so  far  to  any 
candidate  problem  as  the  current  incumbent,  then  the  final  incumbent  must 
obviously  be  an  optimal  solution  of  (P). 

The  technique  of  separation  involves  "branching"  on  a  single 
integer variable. For the SPP/SCP where $X_j$ is declared to be a binary
variable, the ILP can be separated into two subproblems by means of the
mutually exclusive and exhaustive restrictions $X_j = 0$ or $X_j = 1$. An
enumeration  tree  may  be  visualized  with  a  vertex  associated  with  each 
separation  and  an  edge  with  each  restriction.  The  tree  predecessor 
relationship  among  vertices  reveals  the  ordering  among  separations  and 
their  associated  restrictions.  This  enumeration  tree  provides  a  visually 
appealing illustration of the solution sequence.

2.  Relaxation

Any  constrained  optimization  problem  (P)  can  be  "relaxed"  by 
loosening its constraints, resulting in a new problem (PR). By far the
most popular type of relaxation for the ILP is to replace the integrality
restriction on the variables of (P) by simple bounds on the variables,
producing the continuous problem (PR). The only requirement for (PR)
to be a valid relaxation is that F(P) ⊆ F(PR). For the minimization
problem, this definition implies:

1.  If (PR) has no feasible solutions, then the same is true of (P).

2.  The minimal value of (P) is no less than the minimal value of (PR).

3.  If an optimal solution of (PR) is feasible in (P), then it is an
optimal  solution  of  (P). 

In  selecting  between  the  alternative  types  of  relaxation  for  a 
given  problem,  there  are  two  main  criteria  to  be  considered.  First,  it 
is  desirable  for  the  relaxed  problem  to  be  significantly  easier  to  solve 
than  the  original.  Second,  one  would  like  (PR)  to  yield  an  optimal 
solution  of  (P),  or,  failing  that,  the  minimal  value  of  (PR)  should  be 
as  close  as  possible  to  that  of  (P).  The  distance  between  the  minimal 
values  of  (PR)  and  (P)  is  often  described  as  the  "gap,"  and  is  used  as 
a  measure  of  the  "strength"  (small  gap)  or  "weakness"  (large  gap)  of  the 
relaxation  (PR).  Unfortunately,  the  objectives  that  (PR)  be  both 
"strong"  and  easy  to  solve  are  antagonistic.  In  general,  the  easier  (PR) 
is  to  solve,  the  greater  the  "gap"  is  between  the  original  and  relaxed 
problems. 
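
A small worked example, constructed here for illustration and not drawn from the test problems of this study, exhibits a gap for the continuous relaxation of a unit-cost SCP with three rows and three columns, each column covering two rows.

```latex
\begin{aligned}
(P):\quad \min\ & X_1 + X_2 + X_3 \\
\text{s.t.}\ & X_1 + X_2 \ge 1, \quad X_2 + X_3 \ge 1, \quad X_1 + X_3 \ge 1, \\
& X_j \in \{0,1\}, \quad j = 1, 2, 3. \\[6pt]
(P_R):\quad & \text{as } (P), \text{ with } X_j \in \{0,1\} \text{ replaced by } 0 \le X_j \le 1. \\[6pt]
& \text{Summing the three rows gives } 2\,(X_1 + X_2 + X_3) \ge 3, \\
& \text{so the minimal value of } (P_R) \text{ is } \tfrac{3}{2},
  \text{ attained at } X_1 = X_2 = X_3 = \tfrac{1}{2}. \\
& \text{No single column covers all three rows, while } X_1 = X_2 = 1,\ X_3 = 0 \text{ does,} \\
& \text{so the minimal value of } (P) \text{ is } 2 \text{ and the gap is } 2 - \tfrac{3}{2} = \tfrac{1}{2}.
\end{aligned}
```

The optimal solution of the relaxation is fractional and therefore not feasible in (P), so implication 3 above does not apply and a separation is required.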

3.  Fathoming 

Let  (CP)  be  a  typical  candidate  problem  arising  from  the  attempt 
to  solve  (P).  The  ultimate  objective  in  dealing  with  (CP)  is  to  determine 
whether  its  feasible  region  F(CP)  may  possibly  contain  an  optimal  solution 
of  (P),  and  if  it  does,  to  find  it.  If  it  can  be  ascertained  by  some 
means  that  F(CP)  cannot  contain  a  feasible  solution  better  than  the 
incumbent (the best feasible solution yet found), this is certainly
good enough to dismiss (CP) from further consideration, and we say that
(CP)  has  been  fathomed.  If  an  optimal  solution  of  (CP)  can  actually  be 
found,  we  also  say  that  (CP)  has  been  fathomed.  In  either  case,  the 
candidate  problem  has  been  entirely  resolved  for  purposes  of  enumeration, 
and  no  further  separations  of  (CP)  are  necessary.  Thus,  the  subproblems 
which  would  arise  as  restricted  descendants  of  (CP)  have  been  enumerated 
implicitly  by  either  the  bounding  argument  or  the  feasibility  argument. 
Candidate  problem  (CP)  is  fathomed  if  any  one  of  these  criteria 
is  satisfied: 

1.  An  analysis  of  (CPR)  reveals  that  (CP)  has  no  feasible  solution. 

2.  An  analysis  of  (CPR)  reveals  that  (CP)  has  no  feasible  solution 
better  than  the  incumbent. 

3.  An  analysis  of  (CPR)  reveals  an  optimal  solution  of  (CP);  e.g.,  an 
optimal solution of (CPR) is found which happens to be feasible in (CP).

4.  General  Tree  Search  Procedure 

STEP  1:  Initialize  the  candidate  list  with  the  ILP.  Set  the  incumbent 
value,  Z*,  equal  to  infinity. 

STEP  2:  STOP  if  the  candidate  list  is  empty.  If  there  exists  an  incumbent 
then  it  must  be  optimal  in  the  ILP,  otherwise  ILP  has  no  feasible 
solution. 

STEP  3:  Select  a  candidate  problem  (CP)  from  the  list  and  solve  its 
relaxation  (CPR). 

STEP  4:  Fathoming  Criterion  1.  If  the  outcome  of  STEP  3  reveals  (CP) 
to  be  infeasible,  go  to  STEP  2. 

STEP  5:  Fathoming  Criterion  2.  If  the  outcome  of  STEP  3  reveals  (CP) 
has  no  feasible  solution  better  than  the  incumbent,  Z*,  go  to 
STEP  2. 

STEP  6:  Fathoming  Criterion  3.  If  the  outcome  of  STEP  3  reveals  an
optimal  solution  of  (CP),  go  to  STEP  8.

STEP  7:  Separate  (CP)  and  add  its  descendants  to  the  candidate  list. 
Go  to  STEP  2. 




STEP  8:  A  feasible  solution  of  ILP  has  been  found.  If  the  value  of  the 
(CP)  is  less  than  Z*,  record  this  solution  as  the  new  incumbent 
and  set  Z*  =  value  of  (CP).  Go  to  STEP  2. 
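
The eight steps above can be sketched for the SPP as follows. This is a minimal illustration under stated assumptions, not the simplex-based implementation studied here: the relaxation (CPR) simply drops the "at least once" half of each equality constraint, so with non-negative costs its optimum sets every free variable to zero; candidates are taken last-in, first-out; and branching is on the lowest-indexed free variable. A logical test for rows that can no longer be covered is added to Criterion 1. All names are hypothetical.

```python
import math


def branch_and_bound_spp(columns, costs, m):
    """Tree search for the SPP; columns[j] is the set of rows covered by column j.

    Assumes costs[j] >= 0. Returns (optimal value, chosen column indices),
    or (math.inf, None) if no partition exists.
    """
    n = len(columns)
    z_star, incumbent = math.inf, None            # STEP 1
    candidates = [()]                             # a candidate fixes X_0..X_{k-1}

    while candidates:                             # STEP 2: stop when list is empty
        fixed = candidates.pop()                  # STEP 3: select (CP), solve (CPR)
        chosen = [j for j, v in enumerate(fixed) if v == 1]
        count = [0] * m
        for j in chosen:
            for i in columns[j]:
                count[i] += 1
        bound = sum(costs[j] for j in chosen)     # minimal value of (CPR)
        free = range(len(fixed), n)

        # STEP 4: (CPR) infeasible, or some row can no longer be covered.
        if any(c > 1 for c in count):
            continue
        if any(count[i] == 0 and not any(i in columns[j] for j in free)
               for i in range(m)):
            continue
        # STEP 5: nothing in (CP) can beat the incumbent.
        if bound >= z_star:
            continue
        # STEP 6 and STEP 8: the (CPR) optimum is feasible in (CP).
        if all(c == 1 for c in count):
            z_star, incumbent = bound, chosen
            continue
        # STEP 7: separate on the next free variable.
        candidates.append(fixed + (0,))
        candidates.append(fixed + (1,))

    return z_star, incumbent


# Air freight example of Table 1, rows and columns indexed from 0.
routes = [{0, 1, 2}, {1, 2}, {1, 2, 3}, {3, 4, 5}, {3, 4, 5, 6}, {5, 6}, {6}]
route_costs = [0, 0, 0, 0, 6, 7, 4]
best = branch_and_bound_spp(routes, route_costs, 7)
```

On the air freight example the search returns Routes 1, 4, and 7 (indices 0, 3, 6) at a cost of 4. The bound used here is much weaker than an LP bound, so far more of the tree is examined than would be with the relaxations discussed below.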

The  degrees  of  freedom  in  STEP  3  provide  a  host  of  options. 
Critical  among  these  is  the  selection  mechanism  for  branch  variables.  A 
good  branching  strategy  makes  it  possible  to  avoid  searching  large 
portions  of  the  enumeration  tree,  thus  greatly  reducing  time  spent  in  the 
enumeration  process.  STEP  3  can  also  be  prohibitively  expensive  if  the 
solution of (CPR) is not easy to generate from the solution of (CP).
This step requires either storage of many (CP) solutions or a restriction
in the sequence for branch solutions. For instance, "fixed order
enumeration" permits branching only on the last element in the candidate list
previously  associated  with  a  (CP). 

Most  successful  implementations  of  this  general  scheme  use  the 
solution  of  the  LP  relaxation  of  the  ILP  to  obtain  the  bounds  required 
for  the  branch  and  bound  enumeration.  Christofides  [Ref.  2]  and  Marsten 
[Ref.  1]  report  that  most  of  the  current  large-scale  algorithms  use  LP  to 
obtain  bounds  for  their  various  enumeration  procedures.  Exceptions  are 
Etcheberry  [Ref.  55],  who  uses  "Lagrangian  Relaxation"  to  obtain  the  bounds, 
and  Glover  and  Mulvey  [Ref.  3],  who  reformulate  and  use  Generalized 
Network  Relaxations. 

The success of the branch and bound scheme depends on good
branching  strategies  and  the  ability  to  obtain  good  bounds  efficiently 
during  the  tree  search.  Typically,  many  LP's  must  be  solved  and  even 
though  the  integer  requirement  is  relaxed  for  each  LP  restriction,  there 
are  two  serious  problems  associated  with  these  LP's  which  make  them  hard 
to solve: numerical instability and degeneracy.

B.  NUMERICAL INSTABILITY

The  concept  of  a  basis  for  the  linear  program  must  first  be  discussed 
in  order  to  explain  the  computational  difficulties.  Consider  the  system 
of equalities AX = b where X is an n-vector, b an m-vector, and A is an
m x n matrix ($m \le n$). From the n columns of A, we select a set of m
linearly independent columns and denote the m x m matrix determined by
these columns by B. The matrix B is then non-singular and we may uniquely
solve the equations $B X_B = b$ for the m-vector $X_B$, namely, $X_B = B^{-1} b$.
If all n - m components of X not associated with columns of B are set
equal to zero, the solution to the resulting set of equations is said to
be a basic solution to AX = b with respect to the basis B. B is called a
basis since its m linearly independent columns span the space $E^m$.
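
The construction of a basic solution can be sketched with exact rational arithmetic, which sidesteps the round-off issues discussed below. The matrix `A` and the helper names are illustrative assumptions, and the elimination routine is a plain Gauss-Jordan solve rather than any of the inverse representations used by production LP systems.

```python
from fractions import Fraction


def solve(B, b):
    """Solve B x = b by Gauss-Jordan elimination over the rationals."""
    m = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(B, b)]
    for col in range(m):
        pivot = next((r for r in range(col, m) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("chosen columns are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][m] for r in range(m)]


def basic_solution(A, b, basis):
    """Return X with X_B = B^{-1} b on the basis columns and zero elsewhere."""
    B = [[row[j] for j in basis] for row in A]
    x_B = solve(B, b)
    x = [Fraction(0)] * len(A[0])
    for j, v in zip(basis, x_B):
        x[j] = v
    return x


# Hypothetical 3 x 4 system AX = b with 0-1 coefficients.
A = [[1, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 0, 1, 0]]
b = [1, 1, 1]

x_frac = basic_solution(A, b, (0, 1, 2))   # fractional basic solution
x_int = basic_solution(A, b, (1, 2, 3))    # integral basic solution
```

The first basis yields the fractional solution (1/2, 1/2, 1/2, 0). The second yields (0, 0, 1, 1), in which a basic variable takes the value zero.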

It  can  be  seen  from  the  above  explanation  that  the  transformation  by 
the  basis  inverse  is  necessary  to  obtain  a  basic  solution.  It  is  also 
true  that  all  large-scale  LP  systems  available  today  require  some  form  of 
representation  of  this  basis  inverse  transformation  in  order  to  function 
efficiently.  One  popular  representation  is  the  Product  Form  of  the 
Inverse  described  by  Orchard-Hays  [Ref.  56].  Another  example  is  the 
explicit  sub-kernel  representation  described  by  Graves  [Ref.  57]. 

Unfortunately,  the  columns  of  the  SPP/SCP  are  often  nearly   linearly 
dependent.  For  instance,  a  route  generator  will  produce  a  base  route  to, 
say,  five  locations.  By  substituting  alternate  locations  one  at  a  time 
into  the  base  route,  many  routes  are  generated  which  differ  by  only  one 
or  two  elements.  This  can  produce  an  ill-conditioned  basis  whose  inverse 
can  contain  numbers  so  large,  or  so  small  that  after  a  few  iterations 
with  real  arithmetic,  the  computer  is  unable  to  maintain  sufficient
significance  to  provide  the  numerical  stability  necessary  for  the  LP
algorithm  to  converge,  or  if  it  does  converge,  to  produce  the  true 
optimum. 

To  attempt  to  overcome  the  numerical  instability,  it  is  necessary  to 
"clean  up"  the  representation  of  the  inverse  by  a  process  known  as 
reinversion.  There  are  many  different  reinversion  schemes  available,  but 
in  essence,  they  all  use  the  original  problem  data  to  generate  a  new 
representation  of  the  inverse  which  is  relatively  free  from  accumulated 
round-off  error.  Reinversion  is  computationally  expensive,  and  for  the 
SPP/SCP  it  is  often  necessary  to  reinvert  quite  frequently,  thus  slowing 
the  computation  of  the  bounds  needed  by  the  enumeration  scheme. 

C.  DEGENERACY 

The  primal  simplex  algorithm  for  the  solution  of  the  LP  proceeds  from 
one  basic  feasible  solution  of  the  constraint  set  of  a  problem  to  another 
in  such  a  way  as  to  continually  decrease  the  value  of  the  objective 
function  until  a  minimum  is  reached.  If  one  or  more  of  the  basic  variables 
in  a  basic  solution  has  value  zero,  that  solution  is  said  to  be  a  degenerate 
basic  solution.  For  the  SPP/SCP,  it  is  important  to  note  that  seeking 
the  minimum  number  of  columns  capable  of  covering  or  partitioning  the  row 
set  is  exactly  equivalent  to  maximizing   the  primal  degeneracy  present  in 
the  optimal  basic  solution. 
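
The link between a small cover and heavy degeneracy can be made concrete with a short sketch. It assumes an equality-constrained problem in which every nonbasic variable sits at zero; the function name and the six-row data are hypothetical.

```python
def partition_degeneracy(m, columns, chosen):
    """Count basic variables forced to zero by a 0-1 partition.

    columns maps a column name to the rows it covers; chosen lists the
    columns set to one. A basis for m equality rows holds m variables, so
    at least m - len(chosen) of them must sit at zero.
    """
    covered = [0] * m
    for j in chosen:
        for i in columns[j]:
            covered[i] += 1
    if any(count != 1 for count in covered):
        raise ValueError("chosen columns do not partition the rows")
    return m - len(chosen)

# Hypothetical problem: six rows, four candidate columns.
columns = {"a": [0, 1, 2], "b": [3, 4, 5], "c": [0, 3], "d": [1, 2, 4, 5]}
zero_basics = partition_degeneracy(6, columns, ["a", "b"])
```

Here two columns partition six rows, so any basis supporting that solution carries at least four basic variables at zero. The fewer columns a partition needs, the larger this count becomes.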

"Pivoting"  is  the  name  applied  to  the  procedure  which  accomplishes  a 
basis  exchange.  A  degenerate  basis  exchange  is  one  in  which  a  column 
leaves  the  basis,  a  new  column  enters  the  basis,  and  the  value  of  the 
objective  function  does  not  change.  A  degenerate  pivot,  then,  exhibits 
the  undesirable  property  that  the  basis  exchange  uses  up  computation  time,
but  no  overt  improvement  in  the  objective  function  is  realized.  (We  will
ignore  for  the  moment  that  we  face  a  serious  theoretical  dilemma  in 
demonstrating  that  the  simplex  method  is  finite  in  the  presence  of 
degeneracy.)  Every  pivot  involves  the  update  of  the  basis  inverse 
representation;  therefore,  each  update  usually  introduces  round-off 
error.  As  discussed  earlier,  the  basis  inverse  transformation  for  an 
ill-conditioned  basis  accumulates  round-off  error  very  quickly.  In  the
presence  of  massive  degeneracy,  then,  it  is  possible  for  the  convergence 
of  the  primal  simplex  algorithm  to  be  prohibitively  slow,  because  an 
excessive  amount  of  time  is  spent  making  degenerate  basis  exchanges  and
performing  reinversion. 

Degeneracy  and  round-off  error  can  also  produce  a  very  serious 
phenomenon  called  "cycling."  It  is  possible  that  a  repeating  sequence  of 
degenerate  basic  solutions  will  be  generated  such  that  the  simplex 
algorithm  cycles  endlessly  without  making  progress.  Most  LP  systems 
ignore  the  threat  of  cycling,  because  the  repeating  sequence  is  usually 
broken  after  reinversion  "randomly"  permutes  the  row  order,  thus  evoking 
a  new  solution  trajectory.   If,  however,  significant  time  is  spent  in  a 
cycle  while  waiting  for  reinversion  to  be  triggered  by  the  pivot  count, 
the  internal  clock,  or  by  a  check  on  the  rounding  error,  rapid  solution 
of  the  LP  will  not  be  possible. 

It  has  been  determined  that  degeneracy  and  consequent  cycling  are 
significant  obstacles  for  the  efficient  solution  of   the  LP  relaxation  of 
the  SPP/SCP  with  most  of  the  available  LP  systems.  Massive  primal
degeneracy  is  present  as  a  consequence  of  the  binary  coefficients  and  the
fact  that  for  most  SPP/SCP's,  the  right-hand  sides  of  each  row  are
identical.  It  is  this  massive  primal  degeneracy  which  led  Marsten  to
suggest  the  use  of  a  dual  algorithm  for  the  solution  of  the  LP  [Ref.  1]. 
For  the  unit-cost  objective  function,  a  similar  dual  degeneracy  can  also 
be  present.  Although  in  most  problems  with  general  costs,  the  dual 
degeneracy  is  less  severe  than  the  primal  degeneracy,  even  objective 
functions  with  varying  costs  can  produce  an  effective  degeneracy  due  to 
round-off  error.  The  problem  called  TRUCK  exhibits  these  characteristics.

V.  REFORMULATIONS

A.  THE  NEED  FOR  REFORMULATION 

The  LP  relaxation  of  the  SPP/SCP  can  be  numerically  troublesome. 
One  way  to  avoid  this  difficulty  is  to  seek  another  relaxation  which  may 
be  easier  to  solve.  The  alternate  relaxations  examined  here  are  based  on 
networks.  Reformulation  of  the  SPP/SCP  as  a  network  makes  it  possible  to 
exploit  an  efficient  solution  technology.  Network  codes  such  as  GNET 
[Ref.  58],  and  GENNET  [Ref.  59]  use  basis  handling  procedures  which 
require  very  little  real  arithmetic,  thus  avoiding  much  of  the  round-off 
error  problem.  Reformulation  comes  at  the  cost  of  making  the  problem 
larger,  but  it  is  hoped  that  the  superior  speed  and  numerical  stability 
of  the  network  approach  will  more  than  make  up  for  the  increase  in 
problem  size. 

B.  THE  FIRST  GENERALIZED  NETWORK  REFORMULATION 

Glover,  Hultz,  Klingman,  and  Stutz  [Ref.  60]  have  offered  an  interesting 
reformulation  of  any  all-binary  integer  problem  as  an  integer  generalized 
network.  The  Generalized  Network  formulation  is  attractive  because  of 
the  emergence  of  some  very  fast  computer  codes  for  solving  generalized
networks.  Glover,  et  al.,  report  that  their  code,  NetG,  is  up  to  50  times
faster  than  state-of-the-art  commercial  LP  codes  on  continuous  network 
problems.  GENNET  has  proved  to  be  comparable  in  solution  speed  for  the 
same  class  of  problems.  This  reformulation,  then,  (which  we  will  call 
GNIP-1)  offers  some  promise  for  the  solution  of  the  SPP/SCP.  It  also
provides  a  way  of  describing  the  SPP/SCP  in  network  terms  which  can  make
it  easier  to  formulate  and  explain  the  model. 
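
The SPP

\[
\begin{aligned}
\text{(1)} \quad & \text{MIN} \; \sum_{j=1}^{n} C_j X_j \\
\text{(2)} \quad & \text{s.t.} \; \sum_{j=1}^{n} a_{ij} X_j = 1, \quad i = 1, \ldots, m \\
\text{(3)} \quad & X_j \in \{0,1\}, \quad j = 1, \ldots, n \\
\text{(4)} \quad & C_j > 0, \quad j = 1, \ldots, n \\
\text{(5)} \quad & a_{ij} = \begin{cases} 1 & \text{if column } j \text{ covers row } i, \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
\]

reformulated as a Generalized Network becomes the GNIP-1

\[
\begin{aligned}
& \text{MIN} \; \sum_{k} C_k X_k \\
& \text{S.T.} \; \sum_{k:\text{head } j} -M_k X_k + \sum_{k:\text{tail } j} Y_k = 0, \quad j = 1, \ldots, n \\
\text{(12)} \quad & \sum_{k:\text{head } i} Y_k = 1, \quad i = 1, \ldots, m \\
& X_k \in \{0,1\} \\
& 0 \le Y_k \le 1.
\end{aligned}
\]

For the SCP, (12) would be replaced by

\[
\sum_{k:\text{head } i} Y_k \ge b_i .
\]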

The  procedure  for  drawing  the  network  flow  diagram  is  given  below. 

Given a SPP (SCP) with m rows, n columns, and NCE(j) non-zero elements
per column,

1. Create a node i for each constraint, i = 1, ..., m; and give each
node a demand of 1 (SCP: ≥ b_i).

2. Create a node j for each variable, j = 1, ..., n; where demand =
supply = 0.

3. Create a super source node S and give it a supply ≥ 0.

4. Create a generalized arc X_k (S,j) for each original variable.

   a. Assign arc X_k a cost of C_j.

   b. Designate arc X_k as an integer (0-1) arc.

   c. Give arc X_k a multiplier M_k equal to NCE(j).

5. Create a pure network arc Y_k (j,i) for each non-zero element in
column j.

   a. Assign arc Y_k a cost of zero.

   b. Assign arc Y_k an upper bound of one and a lower bound of zero.
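
The five construction steps can be sketched in code as follows. This is a minimal illustration, not the data structure used by any of the network codes cited here; the function name, the dictionary layout, and the three-row example are assumptions.

```python
def build_gnip1(columns, costs, m):
    """Build the GNIP-1 network for a 0-1 problem.

    columns[j] lists the (0-based) rows covered by column j and costs[j]
    is C_j. Returns (nodes, arcs) as plain dictionaries and lists.
    """
    nodes = {"S": {"kind": "source"}}                        # step 3
    for i in range(m):                                       # step 1
        nodes[("row", i)] = {"kind": "constraint", "demand": 1}
    arcs = []
    for j, rows in enumerate(columns):
        nodes[("var", j)] = {"kind": "variable", "demand": 0}    # step 2
        arcs.append({"name": "X", "tail": "S", "head": ("var", j),   # step 4
                     "cost": costs[j], "multiplier": len(rows),  # M_k = NCE(j)
                     "integer": True, "lower": 0, "upper": 1})
        for i in rows:                                       # step 5
            arcs.append({"name": "Y", "tail": ("var", j), "head": ("row", i),
                         "cost": 0, "multiplier": 1,
                         "integer": False, "lower": 0, "upper": 1})
    return nodes, arcs

# Hypothetical problem with m = 3 rows and n = 4 columns.
columns = [[0, 1], [1, 2], [0, 2], [2]]
costs = [4, 3, 5, 2]
nodes, arcs = build_gnip1(columns, costs, m=3)
```

The network has m + n + 1 nodes and one arc per variable plus one arc per non-zero element, which is the source of the growth in problem size noted earlier.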

The  GNIP-1  reformulation  of  the  Air  Freight  Example  Problem  is  displayed
in  Figure  1.

It can be seen in Figure 1 that if the flow on a generalized arc X_k,
k:head j, is zero, the flows on the continuous arcs Y_k emanating from the
variable node j are also zero. It is also clear that if the flow on the
generalized arc is 1, a flow of M_k arrives at the variable node, forcing
a flow of 1 on each continuous arc incident to that node.

[Figure 1. GNIP-1 Reformulation of the Air Freight Example. Legend: arc
parameters (C_k, M_k); * = integer (0,1) arc; variable nodes (j) and
constraint nodes (i), each constraint node with DEMAND = 1.]

The telling disadvantage of the GN reformulation is the weakness of
its continuous relaxation. When the integer restriction is removed for
the arcs X_k, there is no assurance that an integral flow will arrive at
the variable node. So far, this is comparable to the LP relaxation of
the integer variable X_j. The difference between the LP relaxation and
GN relaxation lies in the continuous arcs Y_k emanating from the variable
node. Given a fractional supply, there is no assurance that the flows on
each continuous arc will be the same. If the flows were identical, the
GN relaxation would be the same as the LP relaxation. Unfortunately,
empirical evidence suggests that the flows differ widely. Furthermore,
the problem worsens as the number of non-zero elements per column in the
original ILP increases. Srinivasan and Thompson [Ref. 61] report
a practical upper bound of three or four non-zeros per column for a
similar reformulation of set partitioning problems.
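
A small hypothetical instance, not one of the test problems, shows how unequal flows weaken the GN relaxation. In the sketch below the three equality rows of the LP relaxation leave a single degree of freedom, along which the cost stays at 12, while a fractional GNIP-1 flow that splits its supply unevenly meets every demand at a cost of 3. The function names and data are assumptions made for illustration.

```python
# Hypothetical 3-row, 4-column SPP: rows covered by each column, and costs.
columns = [[0, 1], [1, 2], [2], [0]]
costs = [2, 2, 10, 10]
m = 3
TOL = 1e-9

def lp_cost(x):
    """Cost of x after checking the LP relaxation rows sum_j a_ij x_j = 1."""
    if any(v < -TOL or v > 1 + TOL for v in x):
        raise ValueError("variable outside [0, 1]")
    for i in range(m):
        total = sum(x[j] for j, rows in enumerate(columns) if i in rows)
        if abs(total - 1) > TOL:
            raise ValueError("row %d is not covered exactly once" % i)
    return sum(c * v for c, v in zip(costs, x))

def gn_cost(x, y):
    """Cost of a GNIP-1 flow: x[j] on arc (S, j), y[(j, i)] on arc (j, i)."""
    for j, rows in enumerate(columns):
        inflow = len(rows) * x[j]                  # multiplier M_k = NCE(j)
        flows = [y.get((j, i), 0.0) for i in rows]
        if any(f < -TOL or f > 1 + TOL for f in flows):
            raise ValueError("arc flow outside [0, 1]")
        if abs(inflow - sum(flows)) > TOL:
            raise ValueError("flow not conserved at variable node %d" % j)
    for i in range(m):
        arriving = sum(y.get((j, i), 0.0)
                       for j, rows in enumerate(columns) if i in rows)
        if abs(arriving - 1) > TOL:
            raise ValueError("demand not met at constraint node %d" % i)
    return sum(c * v for c, v in zip(costs, x))

# Every LP-feasible point has the form (t, 1 - t, t, 1 - t) and costs 12.
lp_values = [lp_cost([t, 1 - t, t, 1 - t]) for t in (0.0, 0.25, 0.5, 1.0)]

# A GN flow with unequal flows on the arcs leaving each variable node.
x_gn = [0.75, 0.75, 0.0, 0.0]
y_gn = {(0, 0): 1.0, (0, 1): 0.5, (1, 1): 0.5, (1, 2): 1.0}
gn_value = gn_cost(x_gn, y_gn)
```

For this instance the GN relaxation is therefore no better than one quarter of the LP bound, and the gap arises only because the two arcs leaving each variable node carry different flows.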

Glover  and  Mulvey  [Ref.  3]  state  that  it  is  legitimate  to  manipulate 
the  costs  incident  to  a  given  variable  node  provided  that  these  costs 
always  sum  to  C_j.  This  can  be  interpreted  as  a  form  of  "Lagrangian"
manipulation,  taking  side  constraints  into  the  objective  function,  where 
these  side  constraints  stipulate  that  the  flow  on  each  pure  network  arc 
incident  to  each  variable  node  be  the  same.  By  linear  programming 
duality,  there  exists  some  such  assignment  of  costs  for  which  the  optimum
objective  function  value  for  the  GN  is  the  same  as  that  for  the  LP 
relaxation.  Obviously,  the  trick  is  to  find  an  exact  or   heuristic 
procedure  of  assigning  these  costs  to  "strengthen"  the  GN  relaxation. 
Several  attempts  were  made  to  distribute  costs  according  to  the  proportion 
of  flows  on  the  first  and  subsequent  GN  relaxations,  but  these  attempts 
proved  ineffective. 

C.  THE  SECOND  GENERALIZED  NETWORK  REFORMULATION 

One  way  to  strengthen  the  GN  relaxation  is  to  reduce  the  number 
of  continuous  arcs  in  the  reformulation.  Glover  and  Mulvey  [Ref.  3] 
have  presented  another  GN  formulation  for  the  ILP  (GNIP-2)  which
eliminates  the  super  source  node,  the  n  generalized  arcs  emanating  from  it,  and
one  continuous  arc  per  variable  node.  For  the  ILP  with  m  rows,  n  columns, 
and NCE(j) non-zero elements per column,

1. Create a node i for each constraint, i = 1, ..., m; and give each
a supply of one (SCP: ≥ b_i).

2. Create a node j for each variable, j = 1, ..., n; where demand =
supply = 0.

3. Create an arc (i,j) for each non-zero element in the ILP, connecting
each constraint node to the appropriate variable node.

   a. Select one arc for each j and designate it as an integer (0-1)
   generalized arc X_k.

      (1) Assign arc X_k a cost of C_j.

      (2) Assign arc X_k a multiplier of M_k = NCE(j) - 1
      (if NCE is greater than one).

   b. Designate the remaining arcs as continuous generalized arcs Y_k.

      (1) Assign arc Y_k an upper bound of 1.

      (2) Assign arc Y_k a cost of zero.

      (3) Assign arc Y_k a multiplier of M_k = -1.

   c. If NCE(j) = 1

      (1) Create a slack node S with a supply ≤ M.

      (2) Create a continuous arc Y_k (S,j) as in 3b above.
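
The GNIP-2 construction can be sketched in the same spirit. Two choices below are assumptions rather than part of the procedure: the first row listed for each column is the one selected for the integer arc, and a multiplier of 1 is given to the integer arc when NCE(j) = 1, a case the procedure leaves unstated. The function name and data layout are also illustrative.

```python
def build_gnip2(columns, costs, m):
    """Build the GNIP-2 network; columns[j] lists the rows covered by column j."""
    nodes = {("row", i): {"kind": "constraint", "supply": 1}       # step 1
             for i in range(m)}
    arcs = []
    for j, rows in enumerate(columns):
        if not rows:
            raise ValueError("column %d covers no row" % j)
        nodes[("var", j)] = {"kind": "variable", "supply": 0}       # step 2
        nce = len(rows)
        first, rest = rows[0], rows[1:]     # assumption: first row carries X_k
        arcs.append({"name": "X", "tail": ("row", first), "head": ("var", j),
                     "cost": costs[j],
                     # M_k = NCE(j) - 1; the value 1 for NCE(j) = 1 is assumed
                     "multiplier": nce - 1 if nce > 1 else 1,
                     "integer": True, "upper": 1})                  # step 3a
        for i in rest:                                              # step 3b
            arcs.append({"name": "Y", "tail": ("row", i), "head": ("var", j),
                         "cost": 0, "multiplier": -1,
                         "integer": False, "upper": 1})
        if nce == 1:                                                # step 3c
            nodes.setdefault("S", {"kind": "slack"})
            arcs.append({"name": "Y", "tail": "S", "head": ("var", j),
                         "cost": 0, "multiplier": -1,
                         "integer": False, "upper": 1})
    return nodes, arcs

# Hypothetical problem with m = 3 rows and n = 3 columns.
columns = [[0, 1, 2], [1, 2], [2]]
costs = [6, 4, 1]
nodes, arcs = build_gnip2(columns, costs, m=3)
```

Compared with GNIP-1, the super source and its n generalized arcs disappear, so the arc count drops to one per non-zero element plus one slack arc for each single-element column.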

The  GNIP-2  reformulation  of  the  Air  Freight  Example  is  displayed  in 
Figure  2. 

The above procedure produces the following mathematical program GNIP-2

\[
\begin{aligned}
& \text{MIN} \; \sum_{k} C_k X_k \\
& \text{s.t.} \; \sum_{k:\text{head } j} M_k X_k + \sum_{k:\text{head } j} -Y_k = 0, \quad j = 1, \ldots, n \\
\text{(13)} \quad & \sum_{k:\text{tail } i} X_k + \sum_{k:\text{tail } i} Y_k = 1, \quad i = 1, \ldots, m \\
& X_k \in \{0,1\} \\
& 0 \le Y_k \le 1
\end{aligned}
\]

For the SCP, (13) would be replaced by

\[
\sum_{k:\text{tail } i} X_k + \sum_{k:\text{tail } i} Y_k \ge b_i .
\]

[Figure 2. GNIP-2 Reformulation of the Air Freight Example. Legend: arc
parameters (C_k, M_k); * = integer arc.]

Table 5 shows the relative strengths of the GN and LP relaxations,
plus relevant problem dimensions for a few of the test problems. The
column labeled TIME OF LP/GN displays the time required to obtain the
first solution of the LP/GN relaxations, given in IBM 3033 CPU seconds.
A special version of GENNET [Ref. 59] was used to obtain the GN relaxations,
and the current version of the X System [Ref. 62] was used for the LP
relaxations. The GNIP-1 relaxations for a few problems were obtained
with the X System so comparative times will not be given. %OPT is the
value of the continuous relaxation multiplied by 100 and then divided by
the value of the optimal integer solution. The closer %OPT is to 100,
the better the relaxation.
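
The %OPT measure is simple enough to state as a one-line function. The sketch below applies it to the STEINER1 and TIGER1 values reported in Table 6; the function name is an assumption.

```python
def pct_opt(relaxation_value, optimal_value):
    """Relaxation value as a percentage of the optimal integer value."""
    if optimal_value == 0:
        raise ValueError("%OPT is undefined for a zero optimal value")
    return 100.0 * relaxation_value / optimal_value

steiner1 = pct_opt(9.0, 18)         # LP relaxation 9.0, integer optimum 18
tiger1 = pct_opt(56406.0, 56406)    # LP relaxation equals the integer optimum
```

A value of 50 for STEINER1 means the LP bound is only half of the optimal integer value, while a value of 100 for TIGER1 means the relaxation already attains it.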

STEINER1A and STEINER2A are two SCP's created by transposing the
coefficient matrices of STEINER1 and STEINER2 problems. These problems
were constructed because no real problems were available having fewer
than four non-zero elements per column. The GN relaxations have been
reported to be non-competitive with the LP relaxations when the number of
non-zeros per column exceeds three or four [Ref. 61]. The results from
this study indicate that the strength of the GN relaxation is unpredictable.
The GN relaxations for the symmetrical STEINER problems are very competitive
with the LP relaxations. The GN relaxation for BUS, however, was
unexpectedly bad. The gap for BUS is so great that there is little hope
of a reasonable solution trajectory in the enumeration phase. Even the
relatively easy problems, TIGER1 and TIGER2, produced GN relaxations of
poor quality.

TABLE 5. GENERALIZED NETWORK RELAXATIONS FOR SELECTED PROBLEMS

PROBLEM     MODEL     ROWS/    COLS/    AVG     TIME OF    %OPT
                      NODES    ARCS     NCE     LP/GN

STEINER1A   SCP          27      117     3.0      0.28     100.0
            GNIP-2      144      351              0.09     100.0

STEINER2A   SCP          45      330     3.0      1.14     100.0
            GNIP-2      375      990              0.19     100.0

STEINER1    SCP         117       27    13.0      0.09      50.0
            GNIP-2      144      351              0.36      49.6

STEINER2    SCP         330       45    22.0      0.58      50.0
            GNIP-2      375      990              1.44      49.1

BUS         SPP          56      530     6.3      8.18      82.8
            GNIP-1      587     3894               -         0.0014
            GNIP-2      587     3372              1.18       0.0023

TIGER1      SPP         160      636     6.7      0.94     100.0
            GNIP-1      797     4769               -        62.7
            GNIP-2      797     4133              2.11      72.9

TIGER2      SPP         107     2188     4.3      9.46     100.0
            GNIP-2     2296     3292     2.0      3.90      46.3

D.  THE  GENERALIZED  PROCESS  NETWORK  REFORMULATION 

Some  recent  work  by  Koene  [Ref.  63]  on  Processing  Networks  offers  an 
interesting  perspective  for  improving  the  GNIP  formulation.  A  processing 
network  is  one  for  which  the  flows  on  arcs  going  out  from  (or  into)  a 
given  node  are  proportional  to  each  other.  To  achieve  this  proportionality 
(in  our  case,  we  desire  equal  flows),  a  network  with  side  constraints 
must  be  solved.  Several  authors  have  reported  some  success  with  solving
pure  and  generalized  networks  with  a  "few"  complicating  constraints
[Refs.  64,  65,  66].  Unfortunately,  the  number  of  side  constraints  in 
this  case  grows  again  with  the  number  of  non-zero  elements  in  each  column 
of  the  ILP.  In  fact,  so  many  side  constraints  are  needed  in  these 
problems  that  the  size  of  the  basis  inverse  representation  which  must  be 
carried  along  with  the  network  becomes  prohibitive. 

Because  the  side  constraint  portion  is  so  large,  it  was  decided  to 
view  the  generalized  process  network  problem  as  a  candidate  for  either 
generalized  upper  bounding  factorization  or  network  factorization  routines 
embedded  within  an  LP  system.  Generalized  Upper  Bounding  (GUB)  refers  to 
a  set  of  rows  with  at  most  one  non-zero  in  each  column.  The  coefficients 
of  each  non-zero  must  be  +1  or  -1  (or  capable  of  being  scaled  to  +1  or  -1).
Network  factorization  refers  to  a  set  of  rows  with  at  most  two  non-zeros 
in  each  column,  and  the  non-zeros  may  be  of  any  value  or  sign.  Brown  and 
Wright  [Ref.  67]  have  examined  techniques  for  extracting  network  structures 
embedded  in  general  LP  problems,  but  the  test  bed  for  network  factorization 
routines  coupled  with  an  Integer  Programming  System  is  not  yet  in  place, 
so  no  results  can  be  reported  at  this  time.  A  Generalized  Upper  Bounding 
factorization  routine  was  available,  however,  and  a  formulation  of  the 
GNIP-2  process  network  scaled  for  GUB  appears  below. 


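\[
\begin{aligned}
\text{(14)} \quad & \text{MIN} \; \sum_{k} C_k X_k \\
\text{(15)} \quad & \text{S.T.} \; \sum_{k:\text{head } j} X_k + \sum_{k:\text{head } j} -Y_k = 0, \quad j = 1, \ldots, n; \quad \text{(GUB)} \\
\text{(16)} \quad & \sum_{k:\text{tail } i} \lambda_k X_k + \sum_{k:\text{tail } i} Y_k = 1, \quad i = 1, \ldots, m; \\
\text{(17)} \quad & Y_{k:\text{tail } i} - Y_{k':\text{tail } i} = 0, \quad i = 1, \ldots, n; \; k \ne k';
\end{aligned}
\]

where $\lambda_{k:\text{head } j} = 1/M_{k:\text{head } j}$.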

(14) is the same as for the GNIP-2; (15) forms a GUB set; the right-hand
side of (16) would be replaced by ≥ b_i for the SCP; and (17) is the
side constraint section.
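
The defining property of a GUB set, at most one non-zero per column with each non-zero equal to +1 or -1, can be checked mechanically. The sketch below is illustrative only: the function name and row layout are assumptions, and it does not attempt the rescaling that would bring other coefficients to +1 or -1.

```python
def is_gub_set(rows):
    """Return True if the rows qualify as a GUB set without rescaling.

    rows is a list of {column: coefficient} dictionaries, one per row.
    Each column may appear with a non-zero coefficient in at most one row,
    and every non-zero coefficient must be +1 or -1.
    """
    seen = set()
    for row in rows:
        for col, coef in row.items():
            if coef == 0:
                continue
            if coef not in (1, -1) or col in seen:
                return False
            seen.add(col)
    return True

# Rows shaped like (15): one integer arc (+1) and its continuous arcs (-1).
rows_15 = [{"X0": 1, "Y0a": -1, "Y0b": -1},
           {"X1": 1, "Y1a": -1}]
shares_a_column = rows_15 + [{"X0": 1, "Y2a": -1}]
unscaled = [{"X0": 2, "Y0a": -1}]
```

Rows of type (15) pass the check because every arc has exactly one head node, so no column can appear in two of them.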

The  number  of  GUB  rows  attainable  is  equal  to  the  number  of  variable 
nodes  in  the  GNIP.  This  is  an  enormous  GUB  set,  but  even  with  the  GUB 
rows  not  considered,  the  problem  is  still  larger  than  the  original  ILP. 
It  was  predicted  that  the  basis  inverse  representation  obtained  from  this 
factorization  would  not  be  as  ill-conditioned  as  the  representation 
obtained  from  the  normal  LP  bases.  However,  the  LP's  with  GUB  factorization
are  also  very  difficult  to  solve  and  the  results  do  not  indicate
that  this  promises  to  be  a  competitive  approach.

VI.  COMPUTATIONAL  EXPERIENCE

A.  THE  COMPUTER  CODE 

The  computer  code  used  for  this  research  is  a  large-scale  optimization 
test  bed  called  the  X  System  or  simply  XS.  XS  has  been  developed  since 
1974  [Ref.  62]  as  a  general-purpose  optimization  system  of  advanced 
design  which  serves  both  as  a  prototype  test  bed  for  research  and  as  the 
fundamental  computational  foundation  of  many  application  packages  utilizing 
optimization.  XS  is  designed  to  solve  large-scale  optimization  problems, 
with  special  emphasis  on  mixed  integer  models.  The  embedded  linear 
programming  module  has  received  most  of  the  design  effort  and  exhibits 
many  singular  features  including: 

1.  Hyper-sparse  data  representation  [Ref.  68]; 

2.  Complete  constructive  degeneracy  resolution  [Ref.  57]; 

3.  Basis  factorization  [Ref.  69];  and 

4.  Elastic  range  constraints  [Ref.  62]. 

XS  consists  completely  of  open  FORTRAN  subroutines.   FORTRAN  IV  H 
(Extended)  OPTIMIZE  (2)  is  the  implementation  dialect  and  an  IBM  3033  is 
the  host  computer.   All  of  the  problems  with  the  exception  of  AMERICAN 
were  solved  interactively  under  the  IBM  CMS  timesharing  system. 

1.  Elastic  Model 

XS  requires  that  the  model  be  thought  of  in  an  extended  or 
"elastic"  sense.  The  term  elastic  comes  from  the  view  that  no  constraint 
is  totally  binding,  but  may  be  violated  at  a  price,  an  elastic  penalty. 
The feasible region is thereby "stretched" to the degree of elasticity
specified by the penalty structure. The extended elastic model appears
below.

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{j} C_j X_j + \sum_{i} \left( P_i^- S_i^- + P_i^+ S_i^+ \right) \\
\text{S.t.} \quad & R_i^- - S_i^- \le \sum_{j} a_{ij} X_j \le R_i^+ + S_i^+, \quad i = 1, \ldots, mg; \\
& R_i^- - S_i^- \le \sum_{j} a_{ij} X_j \le R_i^+ + S_i^+, \quad i = mg + 1, \ldots, m; \\
& L_j \le X_j \le U_j, \quad j = 1, \ldots, n; \\
& S_i^- \ge 0, \; S_i^+ \ge 0, \quad i = 1, \ldots, m;
\end{aligned}
\]

where
   C_j               : Cost Coefficients,
   a_ij              : Constraint Coefficients,
   P_i^- and P_i^+   : Lower and Upper Constraint Violation Penalties,
   R_i^- and R_i^+   : Lower and Upper Constraint Range Limits,
   S_i^- and S_i^+   : Logical "artificial" and "surplus" variables,
   X_j               : Variables (any of which may be integer),
   L_j and U_j       : Lower and Upper Variable Bounds,
   a_ij (i ≤ mg)     : Is 0, or +1, or -1 (GUB indicators),
   mg                : Number of GUB rows,
   m                 : Row Dimension,
   n                 : Column Dimension.

2. Hypersparse Data Representation

Appendix 3 exhibits a specimen of the data input format used for
this research. This particular form exploits the hypersparse data
structure capability of XS since there are very few unique real numbers
in a SPP/SCP. All of the constraint coefficients are 1 and 0; therefore,
it is not necessary to store a real number for each non-zero coefficient,
but merely its address. Further efficiencies can be realized in a similar
manner for the unit-cost objective function, and for the right-hand side
for which b_i = 1 for all i.
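
The idea of storing addresses instead of real numbers can be illustrated with a column-wise index structure. This is a generic sketch and not the XS data format, which is specified in Appendix 3; the function names and the example data are assumptions.

```python
def pack_columns(columns):
    """Pack 0-1 columns into two integer arrays with no stored coefficients.

    columns[j] lists the rows in which column j has a one. The rows of
    column j end up in row_index[col_start[j]:col_start[j + 1]].
    """
    col_start = [0]
    row_index = []
    for rows in columns:
        row_index.extend(sorted(rows))
        col_start.append(len(row_index))
    return col_start, row_index

def column_dot(col_start, row_index, j, pi):
    """Inner product of column j with a row vector pi.

    Every non-zero coefficient is 1, so the product is a plain sum of the
    addressed entries of pi and needs no multiplication.
    """
    return sum(pi[i] for i in row_index[col_start[j]:col_start[j + 1]])

# Hypothetical problem with three rows and three columns.
columns = [[0, 2], [1], [2, 0, 1]]
col_start, row_index = pack_columns(columns)
pi = [0.5, 1.5, 2.0]
priced = column_dot(col_start, row_index, 0, pi)
```

Pricing a column against a vector of dual values then reduces to additions over the stored addresses.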

3.  Primal  Dual  Algorithm  and  Degeneracy  Resolution 

The  representation  of  the  Basis  Inverse  maintained  by  XS  admits 
the  application  of  the  Primal  or  Dual  Algorithm  with  equal  facility. 
This  fact  makes  it  possible  to  use  the  Dual  Algorithm  for  this  problem 
class  with  no  loss  of  performance  with  respect  to  the  Primal.  As  mentioned 
earlier,  the  Dual  is  the  algorithm  of  choice  because  of  the  massive 
primal  degeneracy  present  in  this  class  of  problems. 

It  is  this  equal  facility  between  the  Primal  and  the  Dual  which 
provides  the  framework  for  the  degeneracy  resolution  machinery  in  XS.  In 
Graves'  terminology  [Ref.  57],  when  degeneracy  is  encountered,  the 
algorithm  is  said  to  be  "blocked."  The  resolution  of  blocking  in  either 
the  primal  or  dual  algorithm  is  accomplished  by  shifting  to  the  alternate 
algorithm  when  blocking  occurs.  The  alternate  algorithm  is  applied  to  a 
subproblem  of  the  original  problem  and  at  worst  we  are  led  to  a  contracting 
sequence  of  problems  to  which  we  alternately  apply  the  primal  and  dual 
algorithms.  A  strict  contraction  can  be  assured,  and  thus  in  at  most  a 
finite  number  of  steps,  resolution  is  assured.  A  complete  illustration 
of  blocking  resolution  can  be  found  in  [Ref.  70]. 

The  degree  to  which  blocking  is  resolved  through  this  sequential 
nesting  of  subproblems  is  controlled  by  a  blocking  resolution  parameter. 
This  parameter  can  be  set  so  that  any  degree  between  no  resolution  and 
total  resolution  can  be  obtained.  The  parameter  also  controls  the  point
at  which  blocking  resolution  begins  in  the  solution  trajectory.  This
means  that  blocking  resolution  can  be  inhibited  early  in  the  solution
trajectory  when  it  may  not  be  efficient  to  resolve  every  degeneracy,  and
then  enabled  as  the  trajectory  nears  optimality.

B.  HEURISTICS  WITHIN  THE  EXACT  ALGORITHM 

The  efficiency  of  branch  and  bound  algorithms  can  be  improved  through 
judicious  use  of  heuristic  information  that  indicates  good  branching 
strategies  to  follow.  If  good  feasible  solutions  (incumbents)  can  be 
obtained  early,  then  fathoming  can  occur  more  quickly  and  more  frequently 
in  the  search.  Also,  premature  termination  of  the  algorithm  will  more 
often  result  in  near-optimal  or  optimal  solutions. 

1.  The  Elastic  Heuristic 

A  robust  technique  for  obtaining  an  incumbent  has  been  incorporated 
into  the  XS  enumeration  system.  Any  continuous  relaxation  of  the  Elastic 
ILP  can  be  rounded  to  an  integer  solution  with  very  little  computational 
effort. Further, all such rounded solutions are admissible (feasible in
the extended elastic sense). The current continuous solution is rounded
in three passes, each of which selects variables from a class defined in
terms of θ, where 0 ≤ θ ≤ .5 and (1 - θ) ≤ X_j ≤ 1 or 0 ≤ X_j ≤ θ.

   Class 1: Nearly integral (0 ≤ θ ≤ .2).

   Class 2: Fractional (.2 ≤ θ ≤ .4).

   Class 3: Ambivalent (.4 ≤ θ ≤ .5).

The rounding heuristic sequentially exhausts variables from each class
and rounds using a "minimal regret function," rounding away from the
worst penalty.

In the default elastic enumeration scheme, there are only depth
and value-motivated fathoming rules. Feasibility plays no direct role
in the enumeration except via the accumulated cost of the penalties in
the objective function. Integer solutions and lower bounds of excellent
quality are empirically produced quite early in the enumeration effort,
permitting routine early termination of the search based on an optimality
tolerance or on a maximum depth (permissible number of fixed variables in
any restriction). Tuning of the method is easily accomplished via these
two limits and the elastic penalties used to express the underlying model.
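
The three-pass classification used by the rounding heuristic can be sketched as follows. Here θ is taken to be the distance from a variable's value to the nearest of 0 and 1, and a value on a class boundary is assigned to the lower-numbered class; both choices are assumptions, as are the function names. The minimal regret function itself is not reproduced.

```python
def rounding_class(x):
    """Return the rounding class (1, 2, or 3) of a relaxed 0-1 variable."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("value outside [0, 1]")
    theta = min(x, 1.0 - x)      # distance to the nearest integer value
    if theta <= 0.2:
        return 1                 # nearly integral
    if theta <= 0.4:
        return 2                 # fractional
    return 3                     # ambivalent

def rounding_passes(solution):
    """Group variable indices by class, in the order the passes visit them."""
    passes = {1: [], 2: [], 3: []}
    for j, x in enumerate(solution):
        passes[rounding_class(x)].append(j)
    return passes

# Hypothetical continuous solution.
passes = rounding_passes([1.0, 0.95, 0.3, 0.5, 0.55, 0.0])
```

Variables in class 1 are rounded in the first pass, so the most nearly integral values are fixed before the ambivalent ones are considered.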

The penalty structure found most effective with this heuristic is
based upon the number of non-zero row elements in each row of the constraint
matrix, called NRE(i). It has also been determined that the upper penalty
P_i^+ should be set at one-half the value of the lower penalty P_i^-, in
order to coerce the heuristic to round up. This forces shallow termination
with respect to depth fathoming. A penalty constant, P, is set at
approximately one order of magnitude greater than the largest cost
coefficient, so that for each row in the SPP/SCP

   Set P_i^- = P/NRE(i).

   Set P_i^+ = (1/2) P_i^-.

Penalties  set  in  this  manner  "communicate"  to  the  heuristic 
that  rows  which  can  be  covered  in  only  a  few  ways  are  more  important  than 
rows  with  a  higher  row  count.  In  the  enumeration,  then,  these  "important" 
rows  are  satisfied  first,  making  it  possible  to  avoid  many  of  the  alternate 
possibilities  available  for  covering  rows  with  a  large  row  count. 
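
The penalty rule can be written out as a short sketch. The factor of 10 stands in for "approximately one order of magnitude" and, like the function name and example data, is an assumption.

```python
def elastic_penalties(columns, costs, m, factor=10.0):
    """Return (lower, upper) penalty lists for the m rows.

    columns[j] lists the rows covered by column j. The lower penalty of
    row i is P / NRE(i), and the upper penalty is half the lower one.
    """
    nre = [0] * m
    for rows in columns:
        for i in rows:
            nre[i] += 1                    # non-zero elements in row i
    if any(count == 0 for count in nre):
        raise ValueError("some row cannot be covered by any column")
    P = factor * max(costs)                # assumed penalty constant
    lower = [P / count for count in nre]
    upper = [0.5 * p for p in lower]
    return lower, upper

# Hypothetical problem: the middle row can be covered in three ways.
columns = [[0, 1], [1, 2], [1]]
costs = [3, 5, 4]
lower, upper = elastic_penalties(columns, costs, m=3)
```

Rows 0 and 2 can each be covered in only one way and so receive the largest penalties, which is how the rule signals their importance to the heuristic.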

Table 6 exhibits the computational characteristics of the Elastic
Method. LP and LPtime indicate the solution value and solution time (in
CPU seconds) for the first continuous relaxation of the indicated problems.
OPT is the optimal integer solution, and ILPtime is the total CPU time in
seconds required to solve the ILP to the optimum. INPUT/OUTPUT time is
included in the ILPtime values. Input time includes the time to read in
the entire problem and accomplish error checking. Output time includes
the printing of intermediate information plus the time required to
execute the report writer. The dual algorithm was used for all solutions.

TABLE 6. COMPUTATIONAL RESULTS FOR THE ELASTIC METHOD

PROBLEM           LP         LPtime        OPT         ILPtime

STEINER1          9.0          0.08         18           25.13
MCOVER1           0.0 (9)      0.05          0 (18)       0.39
STEINER2         15.0          0.52         30          527.08
MCOVER2           0.0 (15)     0.22          0 (30)     417.08
TIGER1        56406.0          0.94      56406            0.97
TIGER2        15098.0          9.46      15098            9.53
TIGER2a       15684.0         10.23      15684           10.30
BUS           50754.5          8.18      61308          177.44
AMERICAN      No LP solution after 30 minutes CPU
TRUCK         No LP solution after 30 minutes CPU
TANKER        75941.4         34.05      75941.4         44.62


It is interesting to note that the maximal set covering reformulations
of the two STEINER problems as MCOVER1 and MCOVER2 produced easier
ILP's. The budget constraints for these problems were constructed so
that the maximal covering problems would seek the same optima as the SCP
formulations. TIGER2a is a variant of TIGER2 obtained by eliminating
some of the columns found in the optimal solution of TIGER2. Marsten has
indicated in [Ref. 23] that problems in this general class typically have
many nearly optimal integer solutions. TIGER2a supports this observation:
8 of the 33 optimal columns for TIGER2 were removed to construct TIGER2a,
and the solution is only 3.9 percent worse.

C.  THE  ELASTIC  METHOD  WITH  STARTING  SOLUTIONS 

As  indicated  in  Table  6,  the  LP  relaxations  for  AMERICAN  and  TRUCK 
were  not  solved  within  30  minutes.  After  many  unsuccessful  attempts  to 
overcome  the  numerical  instabilities  peculiar  to  these  LP's,  various 
combinatorial  and  heuristic  methods  for  obtaining  starting  solutions  were 
considered.  Given  a  feasible,  suboptimal  solution  of  "reasonable"  quality, 
it  should  be  possible  for  the  elastic  method  to  find  the  optimal  solution 
in  few  enough  iterations  to  avoid  many  of  the  numerical  difficulties. 

D.  LOGICAL REDUCTION

Garfinkel and Nemhauser [Ref. 71] have given a set of simple rules
for logically reducing a problem matrix for the SPP/SCP with b_i = 1 for
all i. Although logical reduction is not guaranteed to provide a starting
solution,  substantial  reductions  in  problem  size  can  greatly  improve  the 
numerical  behavior  of  many  problems,  especially  if  the  problems  were 
originally  generated  with  many  inherent  redundancies.  This  explanation 
of  logical  reduction  provides  insight  into  the  structure  of  the  SPP/SCP 
and  is  valuable  for  understanding  and  evaluating  some  of  the  heuristics 
used  for  constructing  starting  solutions. 

Not all of the rules were chosen for implementation, because the
reduction returned by the omitted rules does not appear to justify the
computational expense of applying them. The implementation scheme used
is a non-backtracking (polynomial time) routine involving simple binary
comparisons of rows and columns. If significant reduction is achieved
after one application, the scheme is applied iteratively until no more
improvement is obtained. The rules which were implemented are based on
the notions of row and column dominance. For example, if two columns are
equal element-by-element (ColA = ColB), and one has a lower cost, then
the more expensive column is dominated by the cheaper and may be deleted.
Less obviously, if the set of columns covering RowA is wholly contained
in the set of columns covering RowB (RowA ⊆ RowB), then RowB may be
deleted, since any column which covers RowA will also cover RowB. The
four rules used are:

Rule 1:  Delete all null columns and null rows.

Rule 2:  Column Dominance.

    A.  SCP:  If ColA ⊇ ColB and CostA ≤ CostB, delete ColB.

    B.  SPP:  If ColA = ColB, delete the more expensive column.

Rule 3:  Row Singleton.

    A.  Delete the row covered by only one column.

    B.  Fix the variable associated with the singleton to one.

    C.  Delete all rows covered by the fixed variable.

    D.  SPP:  Delete all columns in the rows deleted by 3C.

Rule 4:  Row Dominance.

    A.  If RowA ⊇ RowB, delete RowA.

    B.  SPP:  If RowA is deleted, also delete every column in RowA
        which is not included in RowB.

The column and row reduction schemes can be applied iteratively, since
after the first application, additional dominance may be discovered.
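
The SCP forms of the four rules can be sketched in a few lines of Python. This is only an illustrative sketch, not the routine used in this study: the function name reduce_scp, the dictionary representation of the columns, and the costs in the small seven-row instance are all assumptions made for the example.

```python
def reduce_scp(rows, columns):
    """Apply Rules 1, 2A, 3 and 4A repeatedly to an SCP with b_i = 1.

    rows: iterable of row labels.
    columns: dict mapping column label -> (cost, set of rows covered).
    Returns (fixed columns, remaining rows, remaining columns).
    """
    rows = set(rows)
    cols = {j: (c, set(r) & rows) for j, (c, r) in columns.items()}
    fixed = []

    def cover(i):
        # Set of columns that cover row i.
        return {j for j, (_, r) in cols.items() if i in r}

    changed = True
    while changed:
        # Rule 1: delete null rows and null columns.
        null_rows = {i for i in rows if not cover(i)}
        null_cols = [j for j, (_, r) in cols.items() if not r]
        rows -= null_rows
        for j in null_cols:
            del cols[j]
        changed = bool(null_rows or null_cols)
        # Rule 3: a row covered by exactly one column fixes that column.
        for i in sorted(rows):
            if len(cover(i)) == 1:
                (j,) = cover(i)
                fixed.append(j)
                covered = cols.pop(j)[1]
                rows -= covered
                cols = {k: (c, r - covered) for k, (c, r) in cols.items()}
                changed = True
                break
        if changed:
            continue
        # Rule 2A: delete ColB if ColA covers a superset at no greater cost.
        names = sorted(cols)
        for a in names:
            for b in names:
                if a != b and a in cols and b in cols:
                    if cols[a][1] >= cols[b][1] and cols[a][0] <= cols[b][0]:
                        del cols[b]
                        changed = True
        # Rule 4A: delete RowA if its cover set contains that of RowB.
        for a in sorted(rows):
            for b in sorted(rows):
                if a != b and a in rows and b in rows and cover(a) >= cover(b):
                    rows.discard(a)
                    cols = {k: (c, r - {a}) for k, (c, r) in cols.items()}
                    changed = True
    return fixed, rows, cols


# Hypothetical instance: seven rows, seven columns, made-up costs.
example = {
    "R1": (3, {"LA", "SF", "SJ"}), "R2": (2, {"SF", "SJ"}),
    "R3": (4, {"SF", "SJ", "DEN"}), "R4": (3, {"DEN", "POR", "SEA"}),
    "R5": (8, {"DEN", "POR", "SEA", "SD"}), "R6": (7, {"SEA", "SD"}),
    "R7": (4, {"SD"}),
}
fixed, rows_left, cols_left = reduce_scp(
    ["LA", "SF", "SJ", "DEN", "POR", "SEA", "SD"], example)
```

On this instance the sketch fixes R1 and leaves two rows and three columns. The SPP variants (Rules 2B, 3D and 4B) delete additional columns and are not covered here.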




Consider the constraint matrix of the Air Freight Example below.
Application of the reduction rules would achieve the following results.

                 R1   R2   R3   R4   R5   R6   R7

Los Angeles       1    0    0    0    0    0    0
San Francisco     1    1    1    0    0    0    0
San Jose          1    1    1    0    0    0    0
Denver            0    0    1    1    1    0    0
Portland          0    0    0    1    1    0    0
Seattle           0    0    0    1    1    1    0
San Diego         0    0    0    0    1    1    1

Costs             0    0    0    0    6    7    4

For the SPP,

Rule 3:  A.  Delete Los Angeles.

         B.  Fix R1 to one and delete it.

         C.  Delete San Francisco and San Jose.

         D.  Delete R2 and R3.

Rule 4:  A.  Denver ⊇ Portland, delete Denver.
             Seattle ⊇ Portland, delete Seattle.

         B.  Delete R6.

The resulting reduced problem is

                 R4   R5   R7

Portland          1    1    0
San Diego         0    1    1

Costs             0    6    4

Solution = R1, R4, R7

The action of deleting a row or column does not necessarily mean
that the row or column has been eliminated from the problem. In the
reduced air freight example, the removal of Denver and Seattle from
consideration means simply that any column which covers Portland will
automatically cover Denver and Seattle; therefore, Portland is the
critical row. The same observation holds for San Diego. Table 7 exhibits
the degree of reduction achieved and the computation times for two of the
test problems. % RED is derived by dividing the number of rows/columns
deleted by the number of original rows/columns, respectively. No reduction
was achieved for STEINER1, STEINER2, BUS, TIGER2, TANKER, and TRUCK.
TIME indicates the total time in CPU seconds required to achieve the
indicated reduction.

TABLE 7.  LOGICAL REDUCTION RESULTS FOR SELECTED PROBLEMS

            ROWS    % RED    COLS    % RED    ITERATIONS    TIME

TIGER1       160    50        636    13.8     1               27.3
AMERICAN      95     0       9318    47.0     1             1743.0

The tremendous column reduction achieved on AMERICAN is due to the
absence of crew base constraints. During the original generation of this
problem, entire sets of columns were replicated and were designed to be
kept distinct by the mutually exclusive crew base constraints.
Unfortunately, the original crew base constraints are no longer available.
Once the reduction for AMERICAN is explained, then, the benefits obtainable
by logical reduction do not appear to justify its computational expense.
The reduction routine could be a valuable aid, however, in validating
the performance of column generation programs, and could also be of use
in the first screening of problem data.

E.   A  GREEDY  HEURISTIC  ALGORITHM 

Baker [Ref. 72] describes a heuristic algorithm developed to exploit
the structure of large airline crew scheduling problems formulated as
SCP's. The approach is to successively augment the solution set for the
SCP by selecting columns which exhibit the minimum average cost per
uncovered row.

STEP 1:  Initialization. Solution set = ∅. Row coverage set = ∅.

STEP 2:  Selection. Choose the column X* that has the minimum
         average cost per uncovered row.

STEP 3:  Update. Add X* to the solution set. Update the row
         coverage set to reflect the rows covered by X*. If all
         rows are covered, STOP. Otherwise GO TO STEP 2.

The worst case bound for the solution obtained from this procedure is
reported by Baker to be:

    SOLN(Heuristic) / SOLN(Opt)  \le  \sum_{k=1}^{E} 1/k ,

where E is the maximum number of non-zero elements in any column in the
solution set. This means that for a set of columns with from four to
ten non-zero elements, the worst solution obtainable from the heuristic
is about 2.1 to 2.9 times the value of the optimum. Table 8
indicates, however, that the actual performance of the heuristic can be
much better than the worst case bound. The column labeled START TIME
indicates the time required to obtain the starting solution. % OPT is
the percentage difference between the starting solution and the optimal
solution. This simple heuristic will provide a classically feasible
solution of reasonable quality for the SCP. The solution for the SPP
will generally not be feasible, since the heuristic will treat the SPP
as a SCP and overcovering of the rows will result.
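
The three steps and the worst case bound can be sketched as follows. This is a minimal illustration under stated assumptions, not Baker's code: the names greedy_cover and harmonic_bound, the tie-breaking by column label, and the four-column instance are all hypothetical.

```python
def greedy_cover(rows, columns):
    """Greedy SCP heuristic: repeatedly pick the column with the
    minimum cost per newly covered row.

    columns: dict mapping column label -> (cost, set of rows covered).
    Returns the list of selected column labels in order of selection.
    """
    uncovered = set(rows)            # STEP 1: nothing is covered yet
    solution = []
    while uncovered:
        best, best_ratio = None, None
        for j in sorted(columns):    # ties go to the first label
            cost, covered = columns[j]
            gain = len(covered & uncovered)
            if gain and (best is None or cost / gain < best_ratio):
                best, best_ratio = j, cost / gain   # STEP 2
        if best is None:
            raise ValueError("some row cannot be covered by any column")
        solution.append(best)                       # STEP 3
        uncovered -= columns[best][1]
    return solution


def harmonic_bound(e):
    """Worst case ratio 1 + 1/2 + ... + 1/e for columns with at most e ones."""
    return sum(1.0 / k for k in range(1, e + 1))


# Hypothetical instance with four rows and four columns.
demo_columns = {
    "A": (3.0, {1, 2, 3}),
    "B": (2.0, {3, 4}),
    "C": (1.0, {4}),
    "D": (2.0, {1, 2}),
}
demo_solution = greedy_cover([1, 2, 3, 4], demo_columns)
```

For E = 4 the bound evaluates to 25/12, and for E = 10 it is a little under 3, which is the range quoted above.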

TABLE 8.  STARTING SOLUTIONS OBTAINED FROM THE BAKER SCP HEURISTIC

            START TIME    % OPT

STEINER1    0.07          5.6
STEINER2    0.06          6.7
TRUCK       8.47          9.19

Because these starting solutions are of limited value for set
partitioning problems, this line of study was terminated in favor of a new
solution technique developed by Brown and Graves [Ref. 73]. The new
technique uses a block partitioning scheme to exploit the intrinsic
structure of the SPP/SCP.

F.  BLOCK  PARTITIONING  ALGORITHM 

Christofides [Ref. 3] describes a block partitioning structure
attributed to Pierce [Ref. 15] which has been used by many researchers for
this problem class. To place the SPP/SCP in block form, we make up m
blocks of columns, one block for each row. Block i consists of
exactly those columns which cover row i, but do not cover rows 1 to i-1.
This produces a staircase matrix with zeros to the right of the staircase.
The blocks in general can be arranged in tableau form as shown in Figure
3, although one or more blocks may be nonexistent in a particular problem.

            Block 1   Block 2   Block 3   Block 4   . . .   Block m

Row 1       1 1       0         0         0                 0
Row 2       0 or 1    1 1 1     0         0                 0
Row 3       0 or 1    0 or 1    1 1 1 1   0                 0
Row 4       0 or 1    0 or 1    0 or 1    1 1 1 1           0
  .
  .
  .
Row m       0 or 1    0 or 1    0 or 1    0 or 1            1 1 1 1 1

Figure 3.  Block Partition Structure

Marsten [Ref. 1] determined experimentally that sorting the rows by
increasing length gave consistently good results for his algorithm, which
favors the shorter rows for early branching. The row with the fewest
1's is placed at the top, and the row with the most 1's ends up at
the bottom. (This ordering by row length is also depicted in Figure 3.)
Intuitively, it seems reasonable that a row which can be covered in only
a few ways is more critical than a row which can be covered in many ways,
and should therefore be dealt with first. This row ordering scheme was
chosen for implementation.
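
The row ordering and the assignment of columns to blocks can be sketched as below. This is an illustration only: the function names, the tie-breaking by row label, and the six-column instance are assumptions, and costs are omitted because they play no part in forming the blocks.

```python
def order_rows_by_length(rows, columns):
    """Sort rows by the number of columns covering them, fewest first."""
    count = {i: sum(1 for covered in columns.values() if i in covered)
             for i in rows}
    return sorted(rows, key=lambda i: (count[i], i))


def block_partition(row_order, columns):
    """Place each column in the block of the first row (in row_order)
    that it covers, so block i holds the columns covering row i but
    none of the rows ordered before it.

    columns: dict mapping column label -> set of rows covered.
    Returns a dict mapping row -> list of column labels in its block.
    """
    position = {i: p for p, i in enumerate(row_order)}
    blocks = {i: [] for i in row_order}
    for j in sorted(columns):
        covered = [i for i in columns[j] if i in position]
        if covered:
            first = min(covered, key=position.__getitem__)
            blocks[first].append(j)
    return blocks


# Hypothetical instance: four rows, six columns.
demo = {"a": {1, 2}, "b": {2, 3}, "c": {3},
        "d": {1, 4}, "e": {2, 4}, "f": {4}}
demo_order = order_rows_by_length([1, 2, 3, 4], demo)
demo_blocks = block_partition(demo_order, demo)
```

Here rows 1 and 3 are each covered by two columns and are placed first; a block may be empty when every column covering its row already belongs to an earlier block.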

Once the problem has been placed into the block structure, three ways
of ordering columns within blocks are found in the literature: (1)
heuristically by increasing or decreasing cost [Ref. 3]; (2)
lexicographically [Ref. 1]; and (3) randomly (i.e., columns are not
explicitly reordered once blocking has been accomplished). The algorithm
developed by Brown and Graves does not presently require that columns be
specially ordered within blocks.

The Block Partitioning Algorithm can be applied both to the initial
LP relaxation and subsequently to the integer enumeration. For the LP,
the problem is first divided into an arbitrary number of block groups
forming a set of distinct LP subproblems. The first LP subproblem (LP_1)
is solved to optimality, the next LP subproblem (LP_2) is dynamically
appended to the first, and then the two subproblems are solved as one.
The third LP subproblem is appended to the solution of the combined
problem, and the procedure continues until all subproblems have been
appended and a global solution has been obtained.

Many  variations  of  this  procedure  are  evident.  TRUCK  has  been  solved 
by  dual  relaxation  of  the  aggregation  of  successive  LP  solutions. 
Individual problems exhibit great sensitivity to the tuning of this
procedure; a few complicating columns are frequently the principal
cause of computational difficulty.

Table 9 presents results for the Block Partitioning Algorithm applied
to the LP only. Subsequent to the solution of the global LP, the elastic
enumeration scheme was used to obtain optimality. OPT TIME indicates the
total time required to achieve optimality. I/O time is included in the
values. All times are in IBM 3033 CPU seconds.

TABLE 9.  RESULTS FOR THE BLOCK PARTITIONING ALGORITHM

            NUMBER OF    BLOCK    NUMBER OF LP    GLOBAL      OPT
            BLOCKS       TIME     SUBPROBLEMS     LP TIME     TIME

BUS          48          0.001    4                10.76      174.50
TIGER2      106          0.04     4                10.98       15.98
TRUCK       194          0.15     4               210.92*     373.72*
TANKER       50          0.11     4                 9.14       22.56
AMERICAN     79          0.24     4               132.52      536.41

*Primal feasible, suboptimal solutions.

The analogous procedure whereby each subproblem is solved to an
integer optimum did not compare favorably with the default elastic
method. This line of study was therefore terminated and another concept
called the "Refinement Procedure" was investigated.

G.  COLUMN  GENERATION  AND  PROBLEM  REFINEMENT 

There  is  no  substitute  for  possessing  the  column  generating  program 
when  attempting  to  solve  the  large-scale  SPP/SCP.  Attempting  to  solve 
large,  static  problems  in  a  vacuum  is  doomed  to  be  either  expensive  or 
impossible.  The  column  generator  and  the  optimization  system  work  best  on 
these  problems  when  they  are  intimately  coupled  so  that  each  module  can 
communicate  with  the  other.  In  this  way,  the  optimizer  works  on  smaller 
problems  and  the  column  generator  produces  only  those  columns  which  can 
contribute  to  a  better  solution. 

Graves [Ref. 73] has developed a refinement procedure for the SPP
which attempts to capture, for a static problem, some of the capabilities
which are present when the column generator is in hand. This procedure
results in a relaxation of the original problem, but it is a workable
scheme which can produce acceptable solutions. The procedure is
implemented as follows:

STEP  1:  Solve  the  SPP  as  a  SCP.  Identify  rows  with  multiple  covers.  If 
no  multiple  covers  exist,  STOP. 

STEP  2:  For  each  column  covering  a  row  which  is  multiply  covered, 

generate  a  new  column  which  does  not  include  rows  with  multiple 
covers.  Original  columns  are  assigned  a  "cost  per  row  covered" 
which  is  used  to  give  new  columns  reduced  costs  proportionate  to 
the  number  of  rows  deleted.  Go  to  STEP  1. 
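
One pass of STEP 2 can be sketched as below. This is a rough illustration, not the procedure as implemented in [Ref. 73]: it assumes that new columns are generated only from columns in the current SCP solution, that the reduced cost is the original cost per row covered times the number of rows retained, and that new columns are labeled by appending a prime. The function name refine and the two-column instance are hypothetical, and the SCP solve of STEP 1 is taken as given.

```python
from collections import Counter


def refine(columns, solution):
    """Generate new columns that avoid the multiply covered rows.

    columns: dict mapping column label -> (cost, frozenset of rows).
    solution: labels of the columns chosen when the SPP is solved as a SCP.
    Returns (set of multiply covered rows, dict of new columns).
    """
    cover_count = Counter(i for j in solution for i in columns[j][1])
    multiple = {i for i, n in cover_count.items() if n > 1}
    new_columns = {}
    for j in solution:
        cost, rows = columns[j]
        kept = rows - multiple
        if kept and kept != rows:
            cost_per_row = cost / len(rows)
            # Cost is reduced in proportion to the number of rows deleted.
            new_columns[j + "'"] = (cost_per_row * len(kept), kept)
    return multiple, new_columns


# Hypothetical instance: row 3 is covered by both selected columns.
demo_cols = {"A": (6.0, frozenset({1, 2, 3})), "B": (4.0, frozenset({3, 4}))}
demo_multiple, demo_new = refine(demo_cols, ["A", "B"])
```

An empty set of multiply covered rows corresponds to the STOP condition of STEP 1; otherwise the new columns are added to the problem before it is solved again.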

Table 10 displays the performance of the refinement procedure on two of
the more difficult SPP's, BUS and AMERICAN. The refinement procedure is
used in conjunction with the Block Partitioning Algorithm set for four LP
subproblems. # REFINEMENTS gives the number of refinement iterations, #
COLUMNS GENERATED gives the total number of new columns generated by the
procedure, and OPT TIME is the total time in CPU seconds required to
achieve the optimal partition.

TABLE 10.  RESULTS FOR THE REFINEMENT PROCEDURE

            # REFINEMENTS    # COLUMNS GENERATED    OPT TIME

BUS         5                123                    12.06
AMERICAN    2                26                     94.60

Comparing the results from Tables 9 and 10, the refinement procedure
produces a solution much more quickly than the other methods: 12.06
versus 174.50 CPU seconds for BUS, and 94.60 versus 536.41 for AMERICAN.
It is difficult to compare the solution values, because the
true cost for each column generated by the procedure is not known. The
costs assigned here to the new columns are representative, however, of
those which would be assigned by the column generator.

VIII.  CONCLUSIONS AND RECOMMENDATIONS

It  has  been  shown  that  practical,  large-scale  set  covering  and  set 
partitioning  problems  can  be  solved  optimally  and  efficiently.  The  Block 
Partitioning  Algorithm  is  clearly  the  most  robust  and  most  successful 
technique  examined  in  this  study,  and  its  efficiency  compares  favorably 
with  published  solution  technologies  for  this  problem  class. 

Unfortunately, the implementation of the Generalized Network Reformulation
for the SPP/SCP did not perform as well as expected. The continuous
relaxation  of  the  Integer  Generalized  Network  is  too  weak  to  be  of 
much  practical  use;  therefore,  this  technique  does  not  hold  much  promise 
for  the  rapid  solution  of  set  covering  and  set  partitioning  problems. 

Much work remains in improving the integer enumeration scheme subsequent
to the solution of the linear programming relaxation. The default
elastic  method  works  well,  but  additional  research  is  needed  to  improve 
its  performance.  The  coupling  of  the  column  generating  program  with  the 
optimizer  is  a  concept  which  holds  great  promise  for  the  efficient 
solution  of  problems  in  this  class.  As  illustrated  by  the  Refinement 
Procedure  results,  spectacular  reductions  in  solution  time  can  result 
from  implementing  this  idea. 

The proposed standard data input format displayed in Appendix B makes
data manipulation both easy and convenient, and dramatically reduces
storage requirements for any mathematical programming system capable of
exploiting it. A tape containing all of the test problems in this format
is available to those who wish to continue research in this area.

APPENDIX A.  DESCRIPTION OF SELECTED APPLICATIONS

A.  POLITICAL  DISTRICTING  (SPP) 

Let  the  rows  represent  m  basic  population  units  (such  as  counties, 
census  tracts,  etc.).  Let  the  columns  represent  n  possible  districts  or 
subsets  of  the  population  units  such  that  each  potential  district  meets 
the  requirements  on  population  size,  contiguity,  compactness,  and  so 
forth. A side cardinality condition (9) usually imposed is that there be
exactly J districts. If C_j is some ordinal measure of the
unacceptability of district j, then an optimal solution to the SPP yields
an optimal districting plan.

B.  COLORING  PROBLEMS  (SPP) 

Consider  the  problem  of  coloring  a  map  so  that  no  two  adjacent  areas 
have  the  same  color.  Let  there  be  m  such  areas.  A  column  j  is  generated 
if  no  two  elements  of  column  j  correspond  to  areas  having  a  common 
boundary.   If  all  costs  are  unity,  an  optimal  partition  indicates  the 
minimum  number  of  colors  needed.  A  direct  application  of  this  concept  is 
the  problem  of  minimizing  the  number  of  distinct  radio  frequencies 
necessary  to  provide  service  in  several  geographical  areas.  A  column  j 
is  generated  if  no  two  elements  of  column  j  correspond  to  areas  with 
overlapping  frequencies. 

C.  NUCLEAR AND CONVENTIONAL TARGETING

1.  Conventional  Scenario  (SCP) 

Let each row i represent a target which must be engaged at
least b_i times. Let each column j represent a weapons system capable
of  engaging  some  subset  of  the  m  targets  within  a  specified  time  period. 
If  the  cost  coefficients  reflect  the  expected  effectiveness  of  a  given 
weapons  system  on  the  targets  covered  by  column  j,  the  optimal  solution 
will  yield  the  most  effective  subset  of  weapons  systems  capable  of 
accomplishing  the  mission.  If  columns  are  generated  so  that  k  missions 
are  possible  for  each  of  p  weapons  systems,  then  a  constraint  will  be 
necessary  to  ensure  that  each  weapon  system  is  given  only  one  mission  in 
the  optimal  solution.  The  maximal  SCP  formulation  can  also  be  used  here 
to  find  the  combination  of  weapons  systems  which  can  engage  some  specified 
proportion  of  targets. 

2.  Nuclear  Scenario  (SPP) 

Let  each  row  represent  a  target  which  must  be  engaged  only  once 
in  a  given  time  period  (to  avoid  fratricide,  for  instance).  Let  each 
column  j  represent  a  weapons  system  capable  of  engaging  some  subset  of 
the  m  targets  (i.e.,  various  footprint  alternatives).  If  a  unit-cost 
objective  function  is  used,  the  optimal  solution  will  yield  the  minimum 
number  of  weapons  systems  needed  to  destroy  all  the  targets. 

D.  INFORMATION RETRIEVAL (SCP)

Consider the problem of retrieving information from n files, where
the jth file is of length C_j. Suppose that m requests for information
are received. Each unit of information is stored in at least one file
j, indicated by a_ij = 1. An optimal cover yields a subset of files that
minimizes the maximum total length which needs to be searched in order to
guarantee retrieval of all the information.

E.  CYCLIC  SCHEDULING  (SCP) 

A  fundamental  problem  of  cyclic  staffing  is  to  size  and  schedule  a 
minimum  cost  workforce  so  that  sufficient  workers  are  on  duty  during  each 
time period. The (k,m) cyclic scheduling problem models the task of
finding the minimum cost assignment of workers to shifts so that each
person works k time periods consecutively out of m, and at least b_i
workers are present during day i. A sample tableau for the (5,7)
cyclic scheduling problem is shown below.


         X1   X2   X3   X4   X5   X6   X7      RHS

MON       1    0    0    1    1    1    1     ≥ b1
TUE       1    1    0    0    1    1    1     ≥ b2
WED       1    1    1    0    0    1    1     ≥ b3
THU       1    1    1    1    0    0    1     ≥ b4
FRI       1    1    1    1    1    0    0     ≥ b5
SAT       0    1    1    1    1    1    0     ≥ b6
SUN       0    0    1    1    1    1    1     ≥ b7

0 ≤ X_j ≤ U_j and integer for all j

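The coefficient pattern of a (k,m) tableau follows directly from the definition and can be generated mechanically. The sketch below is illustrative only; the function name cyclic_tableau is hypothetical, and it assumes that column j stands for the shift which begins in period j and runs for k consecutive periods, wrapping around the cycle.

```python
def cyclic_tableau(k, m):
    """Return the m x m 0-1 matrix of the (k,m) cyclic scheduling problem.

    Entry [i][j] is 1 when the shift starting in period j (0-based)
    includes period i, that is, when i lies in j, j+1, ..., j+k-1 modulo m.
    """
    return [[1 if (i - j) % m < k else 0 for j in range(m)]
            for i in range(m)]


# The (5,7) case: seven daily shifts, each five consecutive days long.
tableau_5_7 = cyclic_tableau(5, 7)
```

Every row and every column of the result contains exactly k ones, which is a quick check on any hand-built tableau of this kind.
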


The above formulation is not a binary program; to transform it into
one, two alternative techniques can be used. If it is desired to
distinguish between individual workers, the above seven columns can be
replicated for each worker. An additional side constraint will be
necessary to ensure that a worker is not selected to work more than one
shift. This approach could result in a large number of columns, so
another alternative is to use a binary representation in place of each
X_j. For any reasonable value of U_j, this is a feasible technique.
For large values of U_j, the requirement that the X_j be integer is
probably not worth the computational expense, and the problem should be
solved as a continuous LP. Additional considerations such as overtime,
days-off scheduling, part-time workers, over- and under-staffing, etc.,
are discussed in [Ref. 51].

F.  SALES  TERRITORY  DESIGN  (SPP) 

A problem facing sales managers is how to identify which customers
should be included in a given sales territory, and how to determine the
best call frequencies for individual customers; in short, how to allocate
a given amount of the time of several salesmen to several hundred
prospective customers so as to maximize sales. Let the rows be customers.
Let the columns represent p sets of candidate territories, one for each of
the p salesmen. Let the costs reflect the potential sales response
evaluations for a particular salesman in territory j. A side constraint
is necessary to ensure that only one territory is picked from each of the
p sets. The requirement that a customer can appear in one and only one
sales territory makes this the SPP.

The generation of candidate territories is a difficult process in
itself. Shanker, et al. [Ref. 8] suggest a procedure which involves
solving a series of integer programs. One ILP selects territories which
maximize demand potential subject to a series of workload, stratification,
and compactness constraints. This set of territories is in turn evaluated
in another ILP which maximizes a piecewise linear response function
subject to calling frequency constraints. Subjective considerations can
be included at various points in the process to help further reduce the
number of candidate territories which finally appear in the SPP.

APPENDIX B.  PROPOSED DATA INPUT FORMAT

A  very  large  part  of  the  time  invested  in  this  research  has  been 
spent  manipulating  and  entering  problem  data  which  came  to  the  author  in 
many  different  forms.  The  sheer  size  of  the  data  sets  made  them  extremely 
unwieldy;  therefore,  it  was  decided  early  on  that  a  compact  format  for 
these  problems  could  make  data  manipulation  both  easy  and  convenient,  and 
would  encourage  other  researchers  to  adopt  this  format  as  a  standard. 
The  format  chosen  has  many  advantages  for  large-scale  problems. 

1.  It is compact, listing only problem dimensions, constraint ranges,
    cost coefficients, and coefficient addresses. This not only
    reduces Input/Output time, but makes it possible to handle quite
    large data sets under interactive, time-sharing systems such as IBM
    CMS.

2.  Storage requirements are easily calculated. Problem dimensions
    are known immediately after reading the first card image. This
    eliminates the need to make multiple passes of the data, or to guess
    at the problem size, as is the case with MPS format [Ref. 74].

3.  Data Generation Programs are simplified. Row and column labels are
    accommodated, but they are not primary keys, thus avoiding
    alphanumeric manipulations with symbol tables.

4.  Column manipulation of data input is made easy since all information
    for each column is contiguous.

5.  This column format is easily generated by commercially available (MPS)
    problem generation systems.

The data input format consists of three sets of card images:

1.  Problem Dimensions. Format (3I6) (One Card)

    a.  M    = Number of Rows

    b.  N    = Number of Columns

    c.  NZEL = Number of Non-zero Elements.

2.  Constraint Ranges. Format (2A4, 2E16.8) (M Cards)

    a.  IR = Row Index

    b.  RL = Lower Range Limit

    c.  RU = Upper Range Limit.

3.  Column Data. (N or More Cards)

    a.  The number of cards needed depends upon the number of non-zero
        elements in each column (= NCE). The format for the first
        column card is (2A4, F14.3, 10I5).

        1.  JC  = Column Index

        2.  C   = Column Cost Coefficient

        3.  NCE = Number of Non-zero Elements in the column

        4.  IR  = Row Addresses of Non-zero Coefficients.

    b.  If NCE is greater than 9, additional column cards are needed
        to hold the row addresses for that column. The format for
        additional column cards is (20X, 10I5).
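
A reader for the three sets of card images can be sketched in a few lines. This is a hedged illustration rather than the input routine used in this study: the function names are hypothetical, the field positions are taken literally from the formats stated above, unused row address fields are assumed to be blank, and reals written with a D exponent are converted before parsing. The two-row sample deck is made up.

```python
def _fields(card, start, width, count):
    """Slice `count` fixed-width fields out of one card image."""
    return [card[start + k * width: start + (k + 1) * width]
            for k in range(count)]


def _real(text):
    """Convert a Fortran real such as 0.10000000D+01 to a float."""
    return float(text.strip().replace("D", "E").replace("d", "e"))


def read_problem(lines):
    """Parse card images into (M, N, constraint ranges, columns)."""
    cards = iter(lines)
    m, n, nzel = (int(f) for f in _fields(next(cards), 0, 6, 3))   # (3I6)
    ranges = []
    for _ in range(m):                                   # (2A4, 2E16.8)
        card = next(cards)
        ranges.append((card[:8].strip(), _real(card[8:24]), _real(card[24:40])))
    columns = []
    for _ in range(n):                                   # (2A4, F14.3, 10I5)
        card = next(cards)
        label, cost = card[:8].strip(), _real(card[8:22])
        ints = [int(f) for f in _fields(card, 22, 5, 10) if f.strip()]
        nce, rows = ints[0], ints[1:]
        while len(rows) < nce:                           # (20X, 10I5)
            card = next(cards)
            rows += [int(f) for f in _fields(card, 20, 5, 10) if f.strip()]
        columns.append((label, cost, rows[:nce]))
    if sum(len(r) for _, _, r in columns) != nzel:
        raise ValueError("non-zero count does not match NZEL")
    return m, n, ranges, columns


# Hypothetical deck: 2 rows, 2 columns, 3 non-zero elements.
sample = [
    "%6d%6d%6d" % (2, 2, 3),
    "%8s%16s%16s" % ("1", "0.10000000D+01", "0.10000000D+21"),
    "%8s%16s%16s" % ("2", "0.10000000D+01", "0.10000000D+21"),
    "%8s%14.3f%5d%5d%5d" % ("1", 1.0, 2, 1, 2),
    "%8s%14.3f%5d%5d" % ("2", 1.0, 1, 2),
]
m, n, ranges, columns = read_problem(sample)
```

Because M, N and NZEL arrive on the first card, a reader of this kind can size its storage before touching the remaining cards, which is the advantage claimed for the format over MPS.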

TABLE 11.  INPUT DATA FOR STEINER1
AN EXAMPLE IN PROPOSED STANDARD FORMAT

   117    27   351
       1  0.10000000D+01  0.10000000D+21
       2  0.10000000D+01  0.10000000D+21
       3  0.10000000D+01  0.10000000D+21
       4  0.10000000D+01  0.10000000D+21
       5  0.10000000D+01  0.10000000D+21
       6  0.10000000D+01  0.10000000D+21
       7  0.10000000D+01  0.10000000D+21
       8  0.10000000D+01  0.10000000D+21
       9  0.10000000D+01  0.10000000D+21
      10  0.10000000D+01  0.10000000D+21
      11  0.10000000D+01  0.10000000D+21
      12  0.10000000D+01  0.10000000D+21
      13  0.10000000D+01  0.10000000D+21
      14  0.10000000D+01  0.10000000D+21
      15  0.10000000D+01  0.10000000D+21
      16  0.10000000D+01  0.10000000D+21
      17  0.10000000D+01  0.10000000D+21
      18  0.10000000D+01  0.10000000D+21
      19  0.10000000D+01  0.10000000D+21
      20  0.10000000D+01  0.10000000D+21
      21  0.10000000D+01  0.10000000D+21
      22  0.10000000D+01  0.10000000D+21
      23  0.10000000D+01  0.10000000D+21
      24  0.10000000D+01  0.10000000D+21
      25  0.10000000D+01  0.10000000D+21
      26  0.10000000D+01  0.10000000D+21
      27  0.10000000D+01  0.10000000D+21
      28  0.10000000D+01  0.10000000D+21
      29  0.10000000D+01  0.10000000D+21
      30  0.10000000D+01  0.10000000D+21
      31  0.10000000D+01  0.10000000D+21
      32  0.10000000D+01  0.10000000D+21
      33  0.10000000D+01  0.10000000D+21
      34  0.10000000D+01  0.10000000D+21
      35  0.10000000D+01  0.10000000D+21
      36  0.10000000D+01  0.10000000D+21
      37  0.10000000D+01  0.10000000D+21
      38  0.10000000D+01  0.10000000D+21
      39  0.10000000D+01  0.10000000D+21
      40  0.10000000D+01  0.10000000D+21
      41  0.10000000D+01  0.10000000D+21
      42  0.10000000D+01  0.10000000D+21
      43  0.10000000D+01  0.10000000D+21
      44  0.10000000D+01  0.10000000D+21
      45  0.10000000D+01  0.10000000D+21
      46  0.10000000D+01  0.10000000D+21
      47  0.10000000D+01  0.10000000D+21
      48  0.10000000D+01  0.10000000D+21
      49  0.10000000D+01  0.10000000D+21
      50  0.10000000D+01  0.10000000D+21
      51  0.10000000D+01  0.10000000D+21
      52  0.10000000D+01  0.10000000D+21
      53  0.10000000D+01  0.10000000D+21
      54  0.10000000D+01  0.10000000D+21
      55  0.10000000D+01  0.10000000D+21
      56  0.10000000D+01  0.10000000D+21
      57  0.10000000D+01  0.10000000D+21
      58  0.10000000D+01  0.10000000D+21
      59  0.10000000D+01  0.10000000D+21
      60  0.10000000D+01  0.10000000D+21
      61  0.10000000D+01  0.10000000D+21
      62  0.10000000D+01  0.10000000D+21
      63  0.10000000D+01  0.10000000D+21
      64  0.10000000D+01  0.10000000D+21
      65  0.10000000D+01  0.10000000D+21
      66  0.10000000D+01  0.10000000D+21
      67  0.10000000D+01  0.10000000D+21
      68  0.10000000D+01  0.10000000D+21
      69  0.10000000D+01  0.10000000D+21
      70  0.10000000D+01  0.10000000D+21
      71  0.10000000D+01  0.10000000D+21
      72  0.10000000D+01  0.10000000D+21
      73  0.10000000D+01  0.10000000D+21
      74  0.10000000D+01  0.10000000D+21
      75  0.10000000D+01  0.10000000D+21
      76  0.10000000D+01  0.10000000D+21
      77  0.10000000D+01  0.10000000D+21
      78  0.10000000D+01  0.10000000D+21
      79  0.10000000D+01  0.10000000D+21
      80  0.10000000D+01  0.10000000D+21
      81  0.10000000D+01  0.10000000D+21
      82  0.10000000D+01  0.10000000D+21
      83  0.10000000D+01  0.10000000D+21
      84  0.10000000D+01  0.10000000D+21
      85  0.10000000D+01  0.10000000D+21
      86  0.10000000D+01  0.10000000D+21
      87  0.10000000D+01  0.10000000D+21
      88  0.10000000D+01  0.10000000D+21
      89  0.10000000D+01  0.10000000D+21
      90  0.10000000D+01  0.10000000D+21
      91  0.10000000D+01  0.10000000D+21
      92  0.10000000D+01  0.10000000D+21
      93  0.10000000D+01  0.10000000D+21
      94  0.10000000D+01  0.10000000D+21
      95  0.10000000D+01  0.10000000D+21
      96  0.10000000D+01  0.10000000D+21
      97  0.10000000D+01  0.10000000D+21
      98  0.10000000D+01  0.10000000D+21
      99  0.10000000D+01  0.10000000D+21
     100  0.10000000D+01  0.10000000D+21
     101  0.10000000D+01  0.10000000D+21
     102  0.10000000D+01  0.10000000D+21
     103  0.10000000D+01  0.10000000D+21
     104  0.10000000D+01  0.10000000D+21
     105  0.10000000D+01  0.10000000D+21
     106  0.10000000D+01  0.10000000D+21
     107  0.10000000D+01  0.10000000D+21
     108  0.10000000D+01  0.10000000D+21
     109  0.10000000D+01  0.10000000D+21
     110  0.10000000D+01  0.10000000D+21
     111  0.10000000D+01  0.10000000D+21
     112  0.10000000D+01  0.10000000D+21
     113  0.10000000D+01  0.10000000D+21
     114  0.10000000D+01  0.10000000D+21
     115  0.10000000D+01  0.10000000D+21
     116  0.10000000D+01  0.10000000D+21
     117  0.10000000D+01  0.10000000D+21